CONVERGENCE RESULTS FOR SYSTEMS OF LINEAR FORMS ON 
CYCLIC GROUPS, AND PERIODIC NILSEQUENCES 



PABLO CANDELA AND OLOF SISASK 



Abstract. Given a positive integer $N$ and a real number $\alpha \in [0, 1]$, let $m(\alpha, N)$ denote
the minimum, over all sets $A \subseteq \mathbb{Z}_N$ of size at least $\alpha N$, of the normalized count of
3-term arithmetic progressions contained in $A$. A theorem of Croot states that $m(\alpha, N)$
converges as $N \to \infty$ through the primes. This theorem is extended here from 3-term
progressions to $k$-term progressions for general $k$, and further to all systems of integer
linear forms of finite complexity. A similar extension is also obtained for a related 
convergence result of the authors that deals with the maximum densities of sets free 
of solutions to certain linear equations. The results rely on a regularity method for 
functions on finite cyclic groups that is framed here in terms of periodic nilsequences. 
To this end we use recent results in higher-order Fourier analysis, in particular some 
regularity results of Szegedy (relying on his joint work with Camarena) combined with 
equidistribution results of Green and Tao. 



1. Introduction 

This paper is concerned with the occurrence of linear configurations in subsets of finite
cyclic groups. By a linear configuration in a set $A \subseteq \mathbb{Z}_N$ we mean a tuple of elements
of $A$ that solve a given homogeneous system of linear equations with integer coefficients.
Such configurations can also be described as images $(\varphi_1(n), \varphi_2(n), \dots, \varphi_t(n)) \in A^t$ of
elements $n \in \mathbb{Z}_N^D$ under a homomorphism $\mathbb{Z}_N^D \to \mathbb{Z}_N^t$ given by a system of linear forms
$\varphi_1, \varphi_2, \dots, \varphi_t : \mathbb{Z}^D \to \mathbb{Z}$. We are interested in counting such configurations, especially
under the weak assumption that the density $|A|/N$ of $A$ in $\mathbb{Z}_N$ is fixed. To this end we
set up the following notation. 

Definition 1.1 (Solution measure). Let $D, t \geq 1$ be integers and let $\Phi = (\varphi_1, \dots, \varphi_t)$
be a system of linear forms $\varphi_1, \dots, \varphi_t : \mathbb{Z}^D \to \mathbb{Z}$. For any function $f : \mathbb{Z}_N \to \mathbb{C}$ we write

$$S_\Phi(f) := \mathbb{E}_{n \in \mathbb{Z}_N^D}\, f(\varphi_1(n)) \cdots f(\varphi_t(n)),$$

referring to this as the solution measure of $f$ across $\Phi$.

When $f$ is the indicator function $1_A$ of a set $A \subseteq \mathbb{Z}_N$ (that is $f(x) = 1$ if $x \in A$ and
$f(x) = 0$ otherwise), the quantity $S_\Phi(A) := S_\Phi(1_A)$ is simply a normalized count of the
configurations corresponding to $\Phi$ in $A$. For example, if $\Phi = (n_1, n_1 + n_2, n_1 + 2n_2, n_1 + 3n_2)$,
then $S_\Phi(A)$ is the number of 4-term arithmetic progressions in $A$, divided by $N^2$.$^1$
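
The normalized count in Definition 1.1 can be checked by direct enumeration for small moduli. The following is a minimal sketch, not part of the original paper: the function name `solution_measure` and the encoding of each linear form as a tuple of integer coefficients are assumptions made here for illustration.

```python
from fractions import Fraction
from itertools import product


def solution_measure(forms, A, N):
    """Brute-force S_Phi(1_A) over Z_N.

    forms: list of coefficient tuples, one per linear form, each of length D
           (hypothetical encoding chosen for this sketch).
    A:     iterable of residues mod N.
    """
    members = {a % N for a in A}
    D = len(forms[0])
    hits = 0
    # Average the product of indicator values over all n in Z_N^D.
    for n in product(range(N), repeat=D):
        values = (sum(c * x for c, x in zip(form, n)) % N for form in forms)
        if all(v in members for v in values):
            hits += 1
    return Fraction(hits, N ** D)


# The system (n1, n1 + n2, n1 + 2*n2, n1 + 3*n2) of 4-term progressions.
FOUR_AP = [(1, 0), (1, 1), (1, 2), (1, 3)]
```

For instance, `solution_measure(FOUR_AP, {0, 1, 2, 3}, 7)` counts the pairs $(n_1, n_2)$ whose progression stays inside the set and divides by $7^2$; the running time is of order $N^D$, so this is only meant for very small $N$.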

Such solution counts have been treated in numerous works. The simplest non-trivial 
case is when the linear forms describe the solution set of a single linear equation. A 
central example is the system of forms $3\mathrm{AP} := (n_1, n_1 + n_2, n_1 + 2n_2)$ determining 3-term
arithmetic progressions. These are solutions to the equation $x - 2y + z = 0$, and it is
a classical result of Roth [19] that for any $\alpha > 0$ and $N > N_0(\alpha)$, any set $A \subseteq \mathbb{Z}_N$
of density at least $\alpha$ contains a non-trivial 3-term progression (i.e. one with $n_2 \neq 0$).
Combined with a short averaging argument of Varnavides [2H], this in fact implies that
$S_{3\mathrm{AP}}(A) \geq c(\alpha)$, where $c(\alpha) > 0$ for non-zero $\alpha$. Moreover, Croot [I] proved that the
best possible lower bound behaves nicely for prime moduli N, in the following sense. 

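Theorem 1.2 (Croot's limit). Fix $\alpha \in [0, 1]$, and for any positive integer $N$ set

$$m_{3\mathrm{AP}}(\alpha, N) := \min_{A \subseteq \mathbb{Z}_N,\ |A| \geq \alpha N} S_{3\mathrm{AP}}(A).$$

Then $m_{3\mathrm{AP}}(\alpha, N)$ converges as $N \to \infty$ through the primes.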

Croot also showed that $m_{3\mathrm{AP}}(\alpha, N)$ can fail to converge if $N$ is allowed to tend to
$\infty$ over the odd numbers [H Theorem 2]. This failure comes from integers sharing some
fixed factor, so it is natural to address it by restricting $N$ to the primes.
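
For very small moduli the quantity $m_{3\mathrm{AP}}(\alpha, N)$ can be computed exhaustively, which may help in getting a feel for the definition. The sketch below is illustrative only: the names `three_ap_measure` and `m_3ap` are hypothetical, $\alpha$ is assumed to be given as an exact `Fraction`, and the search is exponential in $N$.

```python
from fractions import Fraction
from itertools import combinations
from math import ceil


def three_ap_measure(A, N):
    """S_3AP(A): proportion of pairs (a, d) in Z_N^2 with a, a+d, a+2d all in A."""
    members = set(A)
    hits = sum(
        1
        for a in members
        for d in range(N)
        if (a + d) % N in members and (a + 2 * d) % N in members
    )
    return Fraction(hits, N * N)


def m_3ap(alpha, N):
    """Minimum of S_3AP(A) over all A in Z_N with |A| >= alpha * N.

    Adding elements to A can only increase the count, so it suffices to
    search the sets of the smallest admissible size ceil(alpha * N).
    """
    size = ceil(alpha * N)
    return min(three_ap_measure(A, N) for A in combinations(range(N), size))
```

As a usage example, `m_3ap(Fraction(1, 2), 5)` searches the ten 3-element subsets of $\mathbb{Z}_5$; each contains its three trivial progressions and two non-trivial ones, giving $5/25$.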

A central tool in Croot's proof was the classical Fourier transform, and his argu- 
ment can be viewed as an instance of what is now often referred to as the regularity 
method in arithmetic combinatorics. Various Fourier-analytic versions of this method, 
consisting roughly in using the dominant Fourier coefficients of an additive set to obtain 
information on the set's additive structure, have been applied successfully to numerous 
other combinatorial problems; see for instance [HI [lj [7] and also [2^, Chapter 4]. In
the last decade, the scope of this method has been considerably widened by the
development of a generalization of Fourier analysis known as higher-order Fourier analysis.
There have been several recent applications of this theory, accompanying advances in
the theory itself [HI [26]. This paper aims to contribute to this process and illustrate
further the applicability of the theory, in connection with Theorem 1.2. Indeed, while the
classical Fourier-analytic regularity method as used by Croot is known not to be helpful
for studying the analogue of $m_{3\mathrm{AP}}(\alpha, N)$ for longer progressions, we shall show that the
higher-order theory yields a generalization of Theorem 1.2 to $k$-term progressions for
any $k \geq 4$. To this end, in particular we shall adapt parts of the work of Green and Tao
[H] and combine this with results of Szegedy [22]. The generalization of Theorem 1.2 to
longer progressions is then a special case of the following result.

1 More generally, one sees easily that $S_\Phi(A) = |A^t \cap \operatorname{Im} \Phi| / |\operatorname{Im} \Phi|$, where $\Phi$ denotes the
homomorphism $\mathbb{Z}_N^D \to \mathbb{Z}_N^t$ mentioned above (by a slight abuse of notation).

Theorem 1.3. Fix $\alpha \in [0, 1]$ and let $\Phi$ be a system of integer linear forms, any two of
which are linearly independent. Set, for every positive integer $N$,

$$m_\Phi(\alpha, N) := \min_{A \subseteq \mathbb{Z}_N,\ |A| \geq \alpha N} S_\Phi(A).$$

Then $m_\Phi(\alpha, N)$ converges as $N \to \infty$ through primes.

The pairwise linear-independence assumption ensures that the configurations have 
finite complexity in the sense of [9]. We make some further remarks on this assumption 
in Appendix D.

In addition to the minimum number of configurations in sets of a given density, an- 
other central quantity of interest in this area is the maximum density of a set containing 
no configurations whatsoever. 

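Definition 1.4. Given a system $\Phi$ of linear forms $\varphi_1, \dots, \varphi_t : \mathbb{Z}^D \to \mathbb{Z}$ and a positive
integer $N$, we say $A \subseteq \mathbb{Z}_N$ is $\Phi$-free if $A$ does not contain any configurations determined
by $\Phi$, that is if $A^t \cap \Phi(\mathbb{Z}_N^D) = \emptyset$. We define

$$d_\Phi(\mathbb{Z}_N) := \max_{A \subseteq \mathbb{Z}_N,\ A \text{ is } \Phi\text{-free}} |A|/N.$$

If $\mathcal{F}$ is a finite family of such systems, we say $A \subseteq \mathbb{Z}_N$ is $\mathcal{F}$-free if $A$ is $\Phi$-free for every
$\Phi \in \mathcal{F}$, and we define

$$d_{\mathcal{F}}(\mathbb{Z}_N) := \max_{A \subseteq \mathbb{Z}_N,\ A \text{ is } \mathcal{F}\text{-free}} |A|/N.$$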
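
The extremal density in Definition 1.4 can likewise be found by exhaustive search when $N$ is tiny. The following sketch is an illustration under stated assumptions rather than a method from the paper: forms are encoded as coefficient tuples, and the helper names `image`, `is_free` and `max_free_density` are introduced here.

```python
from fractions import Fraction
from itertools import combinations, product


def image(forms, N):
    """The set Phi(Z_N^D) of all configurations, as tuples of residues."""
    D = len(forms[0])
    return {
        tuple(sum(c * x for c, x in zip(form, n)) % N for form in forms)
        for n in product(range(N), repeat=D)
    }


def is_free(A, configurations):
    """True if no configuration has all of its entries in A."""
    members = set(A)
    return not any(all(x in members for x in v) for v in configurations)


def max_free_density(forms, N):
    """d_Phi(Z_N), found by trying subset sizes from N downwards."""
    configurations = image(forms, N)
    for size in range(N, -1, -1):
        if any(is_free(A, configurations) for A in combinations(range(N), size)):
            return Fraction(size, N)


# The non-invariant system (n1, n2, n1 + n2), i.e. solutions of x + y = z.
SCHUR = [(1, 0), (0, 1), (1, 1)]
```

With the system `SCHUR`, a $\Phi$-free set is a sum-free set in the usual sense, and the search recovers the largest sum-free subsets of $\mathbb{Z}_5$ and $\mathbb{Z}_7$, which have size 2.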

Our main result concerning these quantities is the following theorem which extends 
Theorem 1.3]. 

Theorem 1.5. Let $\mathcal{F}$ be a finite family of systems of linear forms, in each of which
the forms are pairwise linearly independent. Then $d_{\mathcal{F}}(\mathbb{Z}_N)$ converges as $N \to \infty$ over
primes.

The quantities

$$m_\Phi(\alpha) := \lim_{N \to \infty,\ N \text{ prime}} m_\Phi(\alpha, N) \quad \text{and} \quad d_{\mathcal{F}} := \lim_{N \to \infty,\ N \text{ prime}} d_{\mathcal{F}}(\mathbb{Z}_N)$$

stemming from these results depend significantly on whether the systems of forms involved
are invariant. We say a system $\Phi$ is invariant if $\Phi(\mathbb{Q}^D)$ is invariant under translations
by constant vectors, that is if $\Phi(\mathbb{Q}^D) + (1, \dots, 1) = \Phi(\mathbb{Q}^D)$.

Regarding the quantity $m_\Phi(\alpha)$, it is a highly non-trivial fact related to Szemerédi's
theorem that if $\Phi$ is invariant then an analogue of the Varnavides version of Roth's
theorem holds, namely $S_\Phi(A) \geq c_\Phi(\alpha) > 0$ for any set $A \subseteq \mathbb{Z}_N$ of density at least
$\alpha$. In particular, we have $m_\Phi(\alpha) > 0$ for every $\alpha > 0$. By contrast, if $\Phi$ is not
invariant, then there exists $\alpha > 0$ such that for each large prime $N$ there is a subset
of $\mathbb{Z}_N$ of density at least $\alpha$ with no $\Phi$-configurations whatsoever; see Lemma 8.5. In
particular we have $m_\Phi(\alpha') = 0$ for all $\alpha' \leq \alpha$. The problem of estimating this function
$m_\Phi$ for a general $\Phi$ is of course an extension of the well-known problem of improving
the bounds for Szemerédi's theorem.
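
Since $\Phi(\mathbb{Q}^D)$ is a linear subspace of $\mathbb{Q}^t$, the invariance condition $\Phi(\mathbb{Q}^D) + (1, \dots, 1) = \Phi(\mathbb{Q}^D)$ holds exactly when the constant vector $(1, \dots, 1)$ lies in that subspace, which is a rank computation. The sketch below illustrates this reformulation; the function names and the encoding of forms as coefficient tuples are assumptions of this example.

```python
from fractions import Fraction


def rank(rows):
    """Rank over Q of a matrix given as a list of rows (Gauss-Jordan elimination)."""
    rows = [[Fraction(x) for x in row] for row in rows]
    rk = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(len(rows)):
            if i != rk and rows[i][col] != 0:
                factor = rows[i][col] / rows[rk][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[rk])]
        rk += 1
    return rk


def is_invariant(forms):
    """True if (1, ..., 1) lies in Phi(Q^D), the column space of the t x D matrix."""
    augmented = [list(form) + [1] for form in forms]
    return rank(forms) == rank(augmented)
```

For example, the 3-term progression system $(n_1, n_1 + n_2, n_1 + 2n_2)$ is invariant (take $n = (1, 0)$), whereas the system $(n_1, n_2, n_1 + n_2)$ describing $x + y = z$ is not.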

On the other hand, the limit $d_{\mathcal{F}}$ is of interest mainly for families $\mathcal{F}$ consisting only
of non-invariant forms. Indeed, even if one weakens the definition of $\Phi$-free sets to allow
them to contain certain trivial configurations, such as constant vectors, it follows from
Szemerédi's theorem that $d_{\mathcal{F}} = 0$ whenever $\mathcal{F}$ contains an invariant form. By contrast,
for families $\mathcal{F}$ consisting only of non-invariant forms it is easy to see that $d_{\mathcal{F}} > 0$ (see
Lemma 8.5), though not much is known concerning the exact value of this constant. It
would be interesting to understand this quantity in general; see [21] for some results of
Schoen in this direction.

Let us now briefly describe the ideas underlying the above theorems. Croot's proof 
of Theorem 1.2 consisted essentially in showing that, given an arbitrary set in $\mathbb{Z}_p$, there
exists a set in $\mathbb{Z}_q$ having roughly the same solution measure for the system of forms
corresponding to 3-term progressions, provided $p$ and $q$ are sufficiently large primes
with $q \gg p$. We shall follow the same broad strategy for Theorem 1.3 and will combine
this with a so-called arithmetic removal lemma to obtain Theorem 1.5. To state the
main result underpinning this strategy, we use the following definition. 

Definition 1.6 (Size of forms). We say that a system $\Phi = (\varphi_1, \dots, \varphi_t)$ of linear forms
$\varphi_1, \dots, \varphi_t : \mathbb{Z}^D \to \mathbb{Z}$ has size at most $L$ if $D, t \leq L$ and the coefficients of each $\varphi_i$ have
absolute value at most $L$.
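
The two hypotheses that recur in the main theorems, bounded size and pairwise linear independence, are both easy to test for a concrete system. The following is a small sketch with hypothetical helper names (`system_size`, `pairwise_independent`), again encoding each form as a tuple of integer coefficients.

```python
from itertools import combinations


def system_size(forms):
    """Smallest L such that the system has size at most L (Definition 1.6)."""
    t, D = len(forms), len(forms[0])
    return max(t, D, max(abs(c) for form in forms for c in form))


def pairwise_independent(forms):
    """True if any two of the forms are linearly independent over Q."""
    for u, v in combinations(forms, 2):
        # Two integer vectors are linearly dependent iff all their 2x2 minors vanish.
        minors = (
            u[i] * v[j] - u[j] * v[i]
            for i in range(len(u))
            for j in range(i + 1, len(u))
        )
        if all(m == 0 for m in minors):
            return False
    return True
```

For the 4-term progression system $(n_1, n_1 + n_2, n_1 + 2n_2, n_1 + 3n_2)$ this gives size 4 (there are four forms) and confirms pairwise independence.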

Our main result can now be stated as follows. 
Theorem 1.7 (Periodic transference).
Let $L \geq 1$ be an integer and let $\delta \in (0, 1)$. Then for any primes $p, q > N_0(\delta, L)$ and any
set $A \subseteq \mathbb{Z}_p$, there is a set $B \subseteq \mathbb{Z}_q$ such that, for any system $\Phi$ of linear forms of size at
most $L$, any two of which are linearly independent, one has $|S_\Phi(A) - S_\Phi(B)| < \delta$.

2 This can be seen to follow easily from [TJ1 Theorem 1].

Note that in particular the densities $|A|/p$ and $|B|/q$ of the sets are very close. This
theorem will actually be a simple consequence of the following functional version, where
we write $S_\Phi(f : \mathbb{Z}_N)$ to emphasize the domain of a function $f$.

Theorem 1.8 (Periodic transference, functional version).
Let $L \geq 1$ be an integer and let $\delta \in (0, 1)$. Then for any primes $p, q > N_0(\delta, L)$ and any
function $f : \mathbb{Z}_p \to [0, 1]$, there is a function $f' : \mathbb{Z}_q \to [0, 1]$ such that, for any system $\Phi$
of linear forms of size at most $L$, any two of which are linearly independent, one has

$$|S_\Phi(f : \mathbb{Z}_p) - S_\Phi(f' : \mathbb{Z}_q)| < \delta.$$

In fact, both of these results hold for $p$ and $q$ positive integers as long as these
do not have small prime factors, so the restriction of $N$ to prime moduli in our main
applications can be relaxed somewhat; this is discussed in Section [SJ 

The paper has the following outline. Section 2 provides background on uniformity
norms, nilmanifolds, and polynomial sequences. In Section 3 we record an inverse
theorem for the $U^d$ norm for functions on finite cyclic groups, in terms of periodic
nilsequences, which follows from the main results of Szegedy in [23], and we state a
corresponding regularity lemma for such functions. In Sections 4 and 5 we develop
variants for the periodic setting of the irrational regularity lemmas and counting
lemmas of Green and Tao [8]. With these lemmas in hand, it turns out that the main
novel idea needed for the transference result above is a construction of a polynomial
nilsequence with prescribed period and equidistribution properties; this is presented in
Section 6. In Section 7 the transference result is proved and the combinatorial
applications above are finally given in Section 8. We make some closing remarks in Section 9.

Acknowledgements. The authors are very grateful to Ben Green for initial con- 
versations that inspired this work. Both authors thank the EPSRC for the postdoctoral 
fellowships that supported their research, and also thank the École normale supérieure,
Paris, and KTH, Stockholm, for welcoming them during the completion of this work. 

2. Background notions 

2.1. Gowers uniformity norms. One of the main tools used in this paper is an 
arithmetic regularity lemma, which decomposes an arbitrary bounded function on $\mathbb{Z}_N$
as a sum of a structured part and some error terms. The sense in which one of these
terms constitutes an error is that it is small in a particular uniformity norm. These
norms can be defined as follows.

Definition 2.1 (Uniformity norms). Let $G$ be a finite abelian group, let $d$ be a positive
integer, and let $f : G \to \mathbb{C}$ be a function. We define

$$\|f\|_{U^d}^{2^d} := \mathbb{E}_{x \in G,\ h \in G^d} \prod_{\epsilon \in \{0,1\}^d} \mathcal{C}^{|\epsilon|} f(x + \epsilon \cdot h),$$

where $\mathcal{C}$ is the complex-conjugation operator, $|\epsilon| = \epsilon_1 + \cdots + \epsilon_d$, and
$\epsilon \cdot h = \epsilon_1 h_1 + \cdots + \epsilon_d h_d$.
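
The average in Definition 2.1 can be evaluated directly for functions on a small cyclic group. The sketch below is only an illustration: the function name `gowers_norm` and the representation of $f$ as a list indexed by $\mathbb{Z}_N$ are assumptions, and the cost is of order $2^d N^{d+1}$, so it is suitable for toy examples only.

```python
import cmath
from itertools import product


def gowers_norm(f, d):
    """U^d norm of f : Z_N -> C, where f is a list of N (complex) values."""
    N = len(f)
    total = 0
    for x in range(N):
        for h in product(range(N), repeat=d):
            term = 1
            for eps in product((0, 1), repeat=d):
                value = f[(x + sum(e * hi for e, hi in zip(eps, h))) % N]
                # Apply complex conjugation |eps| times.
                if sum(eps) % 2 == 1:
                    value = value.conjugate()
                term *= value
            total += term
    mean = total / N ** (d + 1)
    # The mean is real and non-negative in exact arithmetic; abs() absorbs rounding.
    return abs(mean) ** (1 / 2 ** d)


# A non-trivial character of Z_5: its U^1 norm vanishes while its U^2 norm is 1.
chi = [cmath.exp(2j * cmath.pi * x / 5) for x in range(5)]
```

The character `chi` illustrates why the norms become more informative as $d$ grows: it is invisible to the $U^1$ norm, which is just $|\mathbb{E} f|$, but has full $U^2$ norm.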

These norms were introduced by Gowers [5]. Their role in arithmetic combinatorics
is by now well described in several sources; see for example [HI [21]. Here we restrict
the discussion to the following standard facts. First, these norms are nested:
$\|f\|_{U^d} \leq \|f\|_{U^{d+1}}$ for any $d \geq 1$. Second, they can be used to control solution measures,
in the following sense.

Theorem 2.2. For any integer $L \geq 1$ there are integers $s = s(L)$ and $C_L$ such that if
$\Phi$ is any system of integer linear forms of size at most $L$, any two of which are linearly
independent, and $N$ is a positive integer with no prime factors less than $C_L$, then

$$|S_\Phi(f) - S_\Phi(g)| \leq L\, \|f - g\|_{U^{s+1}}$$

for any functions $f, g : \mathbb{Z}_N \to [0, 1]$.

This result is tied to a family of results known in this area as generalized von
Neumann theorems. The proof of this version is essentially contained in [9] and the
result is also discussed in [25]. We note also the simple bound

$$|S_\Phi(f) - S_\Phi(g)| \leq L\, \|f - g\|_{L^1}, \qquad \|f - g\|_{L^1} := \mathbb{E}_{x \in \mathbb{Z}_N} |f(x) - g(x)|, \tag{1}$$

provided $N$ is prime to at least one coefficient of each form in $\Phi$.

2.2. Nilmanifolds and polynomial sequences. This paper depends heavily on the 
work of Green and Tao [8, 10] on the quantitative behaviour of polynomial nilsequences.
In this subsection we review the basic notation and concepts involved, so as to set this
paper in a workable context, but we omit several details, for which we refer the reader
to [8, 10].

Definition 2.3 (Filtrations). Let $G$ be a group. We call a sequence $G_\bullet = (G_i)_{i \geq 0}$ of
subgroups of $G$ a filtration on $G$ of degree at most $s$ if

$$G = G_0 = G_1 \supseteq G_2 \supseteq \cdots \supseteq G_s \supseteq G_{s+1} = G_{s+2} = \cdots = \{\mathrm{id}_G\}$$

and $[G_i, G_j] \subseteq G_{i+j}$ for all $i, j \geq 0$. Here $[g, h] := g h g^{-1} h^{-1}$ denotes the group
commutator of $g, h \in G$ and $[A, B]$ denotes the subgroup of $G$ generated by
$\{[a, b] : a \in A, b \in B\}$.

Definition 2.4 (Nilmanifolds). If $G$ is a connected, simply-connected nilpotent Lie
group and $\Gamma$ is a discrete, cocompact subgroup, we call $G/\Gamma$ a nilmanifold. If $G_\bullet$ is
a filtration on $G$ of degree at most $s$, and the $G_i$ are closed and connected with the
subgroups $\Gamma_i := \Gamma \cap G_i$ cocompact in $G_i$, then we call the pair $(G/\Gamma, G_\bullet)$ a filtered
nilmanifold of degree at most $s$. We define the total dimension of such a nilmanifold to
be the quantity $\sum_{i=0}^{s} \dim G_i$.

Throughout the paper we shall write $m$ for the dimension of $G$ and $m_i$ for the
dimension of $G_i$, whenever it is obvious from the context to which groups we are referring.

We also need the notion of a Mal'cev basis for a filtered nilmanifold $(G/\Gamma, G_\bullet)$. This
notion was introduced in [18], and it is defined and discussed in Appendix A here. For
now we note only a few salient facts. Such a basis provides a real-coordinate system on
$G$ that is consistent with $\Gamma$ and $G_\bullet$, by means of the associated Mal'cev coordinate map
$\psi : G \to \mathbb{R}^m$, a diffeomorphism for which

(i) $\psi(\Gamma) = \mathbb{Z}^m$,

(ii) $\psi(G_i) = \{0\}^{m - m_i} \times \mathbb{R}^{m_i}$, and

(iii) $\psi^{-1}([0, 1)^m) \subseteq G$ is a fundamental domain for $G/\Gamma$, that is for any $g \in G$ there
exists a unique element of $\Gamma$, denoted $[g]$, such that the element $\{g\} := g\,[g]^{-1}$
satisfies $\psi(\{g\}) \in [0, 1)^m$.

Thus an element of $G$ lies in $\Gamma$ if and only if all its coordinates are integers, and in $G_i$
if and only if its first $m - m_i$ coordinates are 0. These coordinates are useful in many
ways, for example in classifying certain homomorphisms on $G$ and in defining a notion
of distance on the nilmanifold (see Appendix A).

Definition 2.5 (Complexity of a filtered nilmanifold). Let $(G/\Gamma, G_\bullet)$ be a filtered
nilmanifold of degree at most $s$, and let $X$ be a Mal'cev basis for $G/\Gamma$ adapted to $G_\bullet$. We
say that $(G/\Gamma, G_\bullet, X)$ has complexity at most $M$ if $m$, $s$ and the rationality$^3$ of $X$ are
all at most $M$.

In this paper a filtered nilmanifold will always come with a Mal'cev basis, but the 
basis may sometimes not be specified explicitly when it is clear from the context. 

We also need the notion of a subnilmanifold. Recall that a rational number is said to
have height $M$ if it equals $a/b$ with $a, b$ coprime and $\max(|a|, |b|) = M$. Recall also that
a subgroup $G'$ of $G$ is said to be a rational subgroup if $\Gamma \cap G'$ is a cocompact subgroup
of $G'$ [10]. We say that such a subgroup $G'$ is $M$-rational, or has complexity at most
$M$ in $G$ (relative to $X$), if the Lie algebra $\mathfrak{g}'$ of $G'$ is generated by linear combinations
of the elements of $X$ with rational coefficients of height at most $M$.

3 See [10, Defn 2.4].

Definition 2.6 (Subnilmanifolds). Given a filtered nilmanifold $(G/\Gamma, G_\bullet, X)$ of degree
at most $s$, a subnilmanifold of $G/\Gamma$ of complexity at most $M$ is a filtered nilmanifold
$(G'/\Gamma', G'_\bullet, X')$ of complexity at most $M$ with each subgroup $G'_i$ in $G'_\bullet$ being a rational
subgroup of $G_i$ of complexity at most $M$, where $\Gamma' = G' \cap \Gamma$, and where each element of
the Mal'cev basis $X'$ is a linear combination of elements of $X$ with rational coefficients
of height at most $M$.

Definition 2.7 (Polynomial sequences). Given a filtration $G_\bullet$ of degree at most $s$ on a
group $G$, we define $\mathrm{poly}(\mathbb{Z}, G_\bullet)$ to consist of all maps $g : \mathbb{Z} \to G$ such that

$$\partial_{h_i} \cdots \partial_{h_1} g(n) \in G_i \quad \text{for all } i \geq 0 \text{ and } h_1, \dots, h_i, n \in \mathbb{Z},$$

where $\partial_h$ is the difference operator given by $\partial_h g(n) := g(n + h)\, g(n)^{-1}$. We call any such
map $g$ a polynomial sequence, or simply a polynomial.

A very useful and non-trivial fact about $\mathrm{poly}(\mathbb{Z}, G_\bullet)$ is that it forms a group under
pointwise multiplication. This is referred to as the Lazard-Leibman theorem in [8]; see
that paper and [15], [16] for further details and references. We shall generally use this
fact without mention. One also has quite a tangible description of polynomials via the 
following lemma. 

Lemma 2.8 (Taylor expansion). Let $g \in \mathrm{poly}(\mathbb{Z}, G_\bullet)$, where $G_\bullet$ has degree at most $s$.
Then there are unique Taylor coefficients $g_i \in G_i$ such that

$$g(n) = g_0\, g_1^{\binom{n}{1}} g_2^{\binom{n}{2}} \cdots g_s^{\binom{n}{s}}$$

for all $n \in \mathbb{Z}$, and, conversely, every such expression represents a polynomial sequence
$g \in \mathrm{poly}(\mathbb{Z}, G_\bullet)$. Moreover, if $H$ is a subgroup of $G$ and $g$ is $H$-valued then we have
$g_i \in H$ for each $i$.$^4$

Proof. Except for the final claim, this is contained in [8, Lemma A.1]; we also give a
proof of a slight generalization in Appendix O. Note that the $g_i$ may be found inductively
by $g_0 := g(0)$, $g_1 := g_0^{-1} g(1)$, $g_i := \big(g_0\, g_1^{\binom{i}{1}} \cdots g_{i-1}^{\binom{i}{i-1}}\big)^{-1} g(i)$, from which the final claim is
clear. □
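
The inductive recipe in the proof of Lemma 2.8 can be tried out in the simplest non-abelian case, the Heisenberg group with its lower central series (a filtration of degree 2). The sketch below is an illustration under assumptions made here: group elements are encoded as triples $(a, b, c)$ standing for the upper unitriangular matrix with entries $a$, $b$ above the diagonal and $c$ in the corner, and all function names are hypothetical.

```python
from fractions import Fraction
from math import comb

# (a, b, c) stands for the matrix [[1, a, c], [0, 1, b], [0, 0, 1]].
IDENTITY = (Fraction(0), Fraction(0), Fraction(0))


def mul(x, y):
    return (x[0] + y[0], x[1] + y[1], x[2] + y[2] + x[0] * y[1])


def inv(x):
    return (-x[0], -x[1], -x[2] + x[0] * x[1])


def power(x, k):
    """x^k for an integer k >= 0, via the closed form for unitriangular matrices."""
    return (k * x[0], k * x[1], k * x[2] + comb(k, 2) * x[0] * x[1])


def evaluate(coeffs, n):
    """g(n) = g_0 g_1^{C(n,1)} ... g_s^{C(n,s)} for an integer n >= 0."""
    out = IDENTITY
    for i, g_i in enumerate(coeffs):
        out = mul(out, power(g_i, comb(n, i)))
    return out


def taylor_coefficients(g, s):
    """Recover g_0, ..., g_s from the values g(0), ..., g(s), as in the proof."""
    coeffs = [g(0)]
    for i in range(1, s + 1):
        partial = evaluate(coeffs, i)  # g_0 g_1^{C(i,1)} ... g_{i-1}^{C(i,i-1)}
        coeffs.append(mul(inv(partial), g(i)))
    return coeffs
```

Starting from chosen coefficients $g_0, g_1 \in G$ and $g_2$ in the centre, building $g$ with `evaluate` and feeding it to `taylor_coefficients` returns the same coefficients, and an integer-valued $g$ yields integer coefficients, in line with the final claim of the lemma.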

4 This final claim is of course essentially a generalization of the fact that a polynomial $p : \mathbb{Z} \to \mathbb{R}$ is
integer-valued if and only if its Newton series $p(n) = a_0 + a_1 n + \cdots + a_s \binom{n}{s}$ has integer coefficients.

Definition 2.9 (Nilsequences). We call a function $f : \mathbb{Z} \to \mathbb{C}$ a (polynomial) nilsequence
of degree at most $s$ and complexity at most $M$ if there is a nilmanifold $(G/\Gamma, G_\bullet, X)$ of
degree at most $s$ and complexity at most $M$, together with a polynomial $g \in \mathrm{poly}(\mathbb{Z}, G_\bullet)$
and a Lipschitz function $F : G/\Gamma \to \mathbb{C}$ with $\|F\|_{\mathrm{Lip}(X)} \leq M$ such that $f(n) = F(g(n)\Gamma)$
for all $n \in \mathbb{Z}$.



The Lipschitz norm is defined here in terms of a metric $d_{G/\Gamma} = d_{G/\Gamma, X}$ on $G/\Gamma$ by

$$\|F\|_{\mathrm{Lip}(X)} := \|F\|_\infty + \sup_{x\Gamma \neq y\Gamma \in G/\Gamma} \frac{|F(x\Gamma) - F(y\Gamma)|}{d_{G/\Gamma}(x\Gamma, y\Gamma)}.$$

The metric structure on $G/\Gamma$ comes from a metric $d_G = d_{G,X}$ on $G$, defined to be the
largest right-invariant metric on $G$ for which the distance from $x$ to the identity is at
most $\|\psi(x)\|_\infty$, i.e. is bounded by its largest coordinate in absolute value. The distance
$d_{G/\Gamma}(x\Gamma, y\Gamma)$ is then defined to be the infimum of $d_G(x', y')$ over all representatives
$x' \in x\Gamma$, $y' \in y\Gamma$. See [10, Defn 2.2] for more details.

Finally, we need a definition for our periodic setting. 

Definition 2.10. Let $(G/\Gamma, G_\bullet)$ be a filtered nilmanifold, and let $N$ be a positive
integer. We say a sequence $g \in \mathrm{poly}(\mathbb{Z}, G_\bullet)$ is $N$-periodic mod $\Gamma$ if $g(n + N)\Gamma = g(n)\Gamma$
for all $n \in \mathbb{Z}$. Occasionally we may drop the mention of the period and $\Gamma$, and simply
refer to a polynomial as being periodic. We say a nilsequence $F(g(n)\Gamma)$ (or an orbit
$(g(n)\Gamma)$) is $N$-periodic if its associated polynomial $g$ is $N$-periodic mod $\Gamma$. Finally, we
call an element $g \in G$ an $N$th root mod $\Gamma$ if $g^N \in \Gamma$.



3. A periodic inverse theorem



In recent years it has been a central objective in higher-order Fourier analysis to 
obtain a general result for the $U^d$ norms known as an inverse theorem. Roughly speaking,
in one of its most useful forms this result should characterize a function on
$[N] := \{1, 2, \dots, N\}$ having non-trivial $U^d$ norm as one having non-trivial correlation with some
$(d-1)$-step nilsequence of bounded complexity. Such a result was finally established by
Green, Tao and Ziegler in [11]. An alternative approach to this inverse theorem was
given by Szegedy in [23], using the theory of nilspaces developed by Camarena and
Szegedy in [2], itself inspired by fundamental work of Host and Kra [13]. The main
results in [23] yield an inverse theorem for functions on a finite cyclic group, involving
periodic nilsequences, which is crucial for this paper. To state this theorem we use the
following notion.

Definition 3.1. Let $p_1(N)$ denote the smallest prime factor of a positive integer $N$
(with $p_1(1) := 1$). We say an infinite set $\mathcal{N}$ of positive integers has characteristic 0 if
$p_1(N) \to \infty$ as $N \to \infty$ through $\mathcal{N}$. We say a sequence of finite abelian groups $(A_i)_{i \in \mathbb{N}}$
of increasing size has characteristic 0 if $\{|A_i| : i \in \mathbb{N}\}$ has characteristic 0.

Remark 3.2. It is clear that $\mathcal{N}$ has characteristic 0 if and only if for any integer $n > 1$,
only finitely many $N$ in $\mathcal{N}$ are divisible by $n$. Thus a sequence $(A_i)_{i \in \mathbb{N}}$ of characteristic 0
as above forms a group-family of characteristic 0 in the sense of [23].
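
The function $p_1$ from Definition 3.1 is straightforward to compute by trial division, which makes it easy to see on examples why the primes have characteristic 0 while the odd numbers do not. A minimal sketch, with the hypothetical name `smallest_prime_factor`:

```python
def smallest_prime_factor(N):
    """p_1(N): the smallest prime factor of N >= 1, with p_1(1) = 1."""
    if N == 1:
        return 1
    p = 2
    while p * p <= N:
        if N % p == 0:
            return p
        p += 1
    return N  # no divisor up to sqrt(N), so N itself is prime


# Along the primes p_1(N) = N tends to infinity; along odd multiples of 3 it stays at 3.
along_primes = [smallest_prime_factor(N) for N in (5, 11, 101, 1009)]
along_odd_multiples_of_3 = [smallest_prime_factor(N) for N in (9, 15, 21, 1011)]
```

The second list shows the obstruction described in Remark 3.2: infinitely many odd numbers are divisible by the fixed integer 3.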

The version of the inverse theorem that we shall use is the following. 

Theorem 3.3 (Periodic inverse theorem). Let $\mathcal{N} \subseteq \mathbb{N}$ be a set of characteristic 0, let
$s$ be a positive integer, and let $\delta > 0$. There exists $M > 0$ such that if $\|f\|_{U^{s+1}(\mathbb{Z}_N)} \geq \delta$
for some function $f : \mathbb{Z}_N \to \mathbb{C}$ with $\|f\|_\infty \leq 1$ and $N \in \mathcal{N}$, then there exists an
$N$-periodic polynomial nilsequence $h$ of degree at most $s$ and complexity at most $M$ such
that $|\mathbb{E}_{n \in \mathbb{Z}_N} f(n) \overline{h(n)}| \geq c_s(\delta) > 0$.

This theorem follows from (the proof of) [23, Theorem 10]. (Note that from the
proof of the latter theorem in [23] one indeed gets that the polynomial underlying the
nilsequence $h$ is $N$-periodic mod $\Gamma$.)

By the same arguments as in [8, Section 2], using in particular that the sum or
product of two $N$-periodic nilsequences is again an $N$-periodic nilsequence, we may
deduce the following arithmetic regularity lemma. 

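Theorem 3.4 (Periodic, non-irrational arithmetic regularity lemma).
Let $\mathcal{N} \subseteq \mathbb{N}$ be a set of characteristic 0, let $s \geq 1$ be an integer, let $\epsilon > 0$, and let
$\mathcal{F} : \mathbb{R}^+ \to \mathbb{R}^+$ be a growth function. There exists $M > 0$ such that for any $N \in \mathcal{N}$ and
any function $f : \mathbb{Z}_N \to [0, 1]$ there is a decomposition

$$f = f_{\mathrm{nil}} + f_{\mathrm{sml}} + f_{\mathrm{unf}}$$

of $f$ into functions $f_{\mathrm{nil}}, f_{\mathrm{sml}}, f_{\mathrm{unf}} : \mathbb{Z}_N \to [-1, 1]$ such that

(i) $f_{\mathrm{nil}}$ is an $N$-periodic nilsequence of degree at most $s$ and complexity at most $M$,

(ii) $\|f_{\mathrm{sml}}\|_2 \leq \epsilon$,

(iii) $\|f_{\mathrm{unf}}\|_{U^{s+1}} \leq 1/\mathcal{F}(M)$, and

(iv) $f_{\mathrm{nil}}$ and $f_{\mathrm{nil}} + f_{\mathrm{sml}}$ take values in $[0, 1]$.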

This regularity lemma essentially allows us to reduce the study of $S_\Phi(f)$ to that of
$S_\Phi(f_{\mathrm{nil}})$, this being useful since we have more structural information about $f_{\mathrm{nil}}$. Much
as noted in [8], however, it turns out that we shall need stronger information still on
$f_{\mathrm{nil}}$: we shall require the orbit underlying the nilsequence to be highly irrational, a
quantitative property which guarantees that certain higher-dimensional variants of the
orbit are equidistributed. We develop the tools we need for this in the next section.

4. Irrationality and the periodic counting lemma 

Using the regularity lemma, from a function $f$ on $\mathbb{Z}_N$ we obtain a nilsequence
$f_{\mathrm{nil}}(n) = F(g(n)\Gamma)$, where the polynomial sequence $g : \mathbb{Z} \to G$ is $N$-periodic mod $\Gamma$.
In [10], Green and Tao developed powerful machinery to understand polynomial orbits
quantitatively, and especially when such orbits are equidistributed. This machinery was
built upon in [8], where a notion of irrationality was introduced that is useful for dealing
with solution measures across linear forms. In particular, if a polynomial sequence $g$ is
highly irrational, then $S_\Phi(F(g(\cdot)\Gamma))$ is very close to a certain integral involving $F$ that is
essentially independent of $g$. A generalization of this is what is called a counting lemma
in [8]. We shall use a slight weakening of this notion of irrationality. Before we define
this formally, which will take some preparation, we state the corresponding counting
lemma by way of motivation. For this we use the following notation: given a sequence
$g : \mathbb{Z} \to G$ and a system $\Phi = (\varphi_1, \dots, \varphi_t)$ of linear forms $\varphi_i : \mathbb{Z}^D \to \mathbb{Z}$, we write

$$g^\Phi(n) := (g(\varphi_1(n)), \dots, g(\varphi_t(n)))$$

for $n \in \mathbb{Z}^D$ (or $n \in \mathbb{Z}_N^D$), and we write $G^\Phi/\Gamma^\Phi \subseteq G^t/\Gamma^t$ for the so-called Leibman
nilmanifold associated with $(G/\Gamma, G_\bullet, X)$ and $\Phi$.$^5$

Theorem 4.1 (Periodic counting lemma). Let $M, D, s, t$ be integers with $1 \leq D, s, t \leq M$,
and let $(G/\Gamma, G_\bullet, X)$ be a filtered nilmanifold of degree at most $s$ and complexity at
most $M$. Let $g \in \mathrm{poly}(\mathbb{Z}, G_\bullet)$ be $A$-irrational and $N$-periodic mod $\Gamma$, for some positive
integer $N$. Let $\Phi$ be a system of linear forms $\varphi_1, \dots, \varphi_t : \mathbb{Z}^D \to \mathbb{Z}$ with coefficients
of magnitude at most $M$. Then, for any Lipschitz function $F : (G/\Gamma)^t \to \mathbb{C}$ with
$\|F\|_{\mathrm{Lip}(X^t)} \leq M$, one has

$$\mathbb{E}_{n \in \mathbb{Z}_N^D} F(g^\Phi(n)\Gamma^t) = \int_{g(0)^\Delta G^\Phi/\Gamma^\Phi} F + o_{A \to \infty; M}(1). \tag{2}$$

Here, as in [8], $g(0)^\Delta$ denotes the element $(g(0), \dots, g(0)) \in G^t$, and the integral is
with respect to the normalized Haar measure on the coset $g(0)^\Delta G^\Phi/\Gamma^\Phi$. When $F$ has the
form $F(x_1, \dots, x_t) = F'(x_1) \cdots F'(x_t)$, the left-hand side of (2) is $S_\Phi(F'(g(\cdot)\Gamma))$. The
norm $\|F\|_{\mathrm{Lip}(X^t)}$ is then easily related to $\|F'\|_{\mathrm{Lip}(X)}$, using the basis $X^t$ for $(G/\Gamma)^t$ implicit
in the above theorem; this is all detailed in Appendix A.

5 This is defined in [8]; we do not need any information about it beyond its occurrence in Theorem 4.1.

Theorem 4.1 is an adaptation of [8, Theorem 1.11] to periodic orbits. While the
periodicity assumption is not present in [8], our assumption of $A$-irrationality is somewhat
weaker than that of $(A, N)$-irrationality used in that paper,$^6$ and the error term in (2)
is independent of $N$. We shall deduce the above theorem from [8, Theorem 1.11], but
first we lay the groundwork for the definition of irrationality, Definition 4.6. Much of
this groundwork follows [8], but we include the details as these are important for later
results. Recall that $\Gamma_i := \Gamma \cap G_i$.

Definition 4.2 ($i$th-level characters). Let $(G/\Gamma, G_\bullet)$ be a filtered nilmanifold of degree
at most $s$. We write $G_i^\nabla$ for the group generated by $G_{i+1}$ and $[G_j, G_{i-j}]$ for $0 \leq j \leq i$. An
$i$th-level character is a continuous homomorphism from $G_i$ to $\mathbb{R}$ which vanishes on $G_i^\nabla$
and is $\mathbb{Z}$-valued on $\Gamma_i$. We say that such a character is non-trivial if it is non-constant.

Remark 4.3. Note that $G_i^\nabla$ is contained in $G_i$ by the filtration property, and is a
normal subgroup of $G$ since each $G_j$ is. Observe that for the lower central series
filtration, these concepts are only interesting for $i = 1$, as for this filtration we have
$G_i^\nabla \supseteq [G_1, G_{i-1}] = G_i$ for $i \geq 2$. Note also that 1st-level characters are precisely the
(lifts of) horizontal characters in the sense of [10]. In [8] $i$th-level characters are called
$i$-horizontal characters.



We next recall the notion of complexity for $i$th-level characters, which is defined
in terms of Mal'cev bases (see Appendix A for the definition of these and their
corresponding coordinate maps). Given a nilmanifold $(G/\Gamma, G_\bullet, X)$, we write $\psi_i$ for the
restriction of the Mal'cev coordinate map $\psi : G \to \mathbb{R}^m$ to the $i$th-level part of $G$,
that is $\psi_i : G_i \to \mathbb{R}^{r_i}$ is $\psi$ composed with the projection to coordinates indexed
between $m - m_i + 1$ and $m - m_{i+1}$, where, as usual, $m = \dim G$ and $m_i = \dim G_i$, and
$r_i := m_i - m_{i+1}$. The following lemma describing $i$th-level characters using this map is
straightforward to verify.

Lemma 4.4 (Frequency vector). With the notation above, any $i$th-level character
$\xi_i : G_i \to \mathbb{R}$ has the form $\xi_i(g) = k \cdot \psi_i(g)$ for some $k \in \mathbb{Z}^{r_i}$.

6 We shall in fact call a polynomial sequence $A$-irrational if it is $(A, N)$-irrational for some $N$. It may
seem strange to talk about sequences that are both periodic and irrational. However, we use these
terms only in the quantitative sense. A polynomial sequence can indeed be $A$-irrational and $N$-periodic
mod $\Gamma$ as long as $N$ is sufficiently large in terms of $A$.






Definition 4.5 (Complexity). We define the complexity of an $i$th-level character $\xi_i$
(relative to $X$) to be $|k_1| + \cdots + |k_{r_i}|$ for the corresponding $k \in \mathbb{Z}^{r_i}$.

The main definition of this section is then the following. 

Definition 4.6 ($A$-irrationality). Let $(G/\Gamma, G_\bullet, X)$ be a filtered nilmanifold of degree at
most $s$. We say that $g_i \in G_i$ is $A$-irrational if for every non-trivial $i$th-level character $\xi_i$
of complexity at most $A$ we have $\xi_i(g_i) \notin \mathbb{Z}$. We say that a polynomial $g \in \mathrm{poly}(\mathbb{Z}, G_\bullet)$
is $A$-irrational if its Taylor coefficient $g_i$ is $A$-irrational for each $i \in [s]$.

Thus, requiring g to be irrational amounts to requiring the Mal'cev coordinates 
of each of its Taylor coefficients not to satisfy any linear equation with small integer 
coefficients. 
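
In the simplest abelian case the definition can be tested mechanically. The sketch below assumes, for illustration only, that $G = \mathbb{R}^m$ with the degree-1 filtration and $\Gamma = \mathbb{Z}^m$ in standard coordinates, so that 1st-level characters are the maps $x \mapsto k \cdot x$ with $k \in \mathbb{Z}^m$; it also restricts to rational coordinates so that the arithmetic is exact. The name `is_irrational` is hypothetical.

```python
from fractions import Fraction
from itertools import product


def is_irrational(x, A):
    """True if k . x is not an integer for every non-zero integer vector k
    with |k_1| + ... + |k_m| <= A, where x is a tuple of Fractions."""
    m = len(x)
    for k in product(range(-A, A + 1), repeat=m):
        weight = sum(abs(c) for c in k)
        if weight == 0 or weight > A:
            continue
        value = sum(c * xi for c, xi in zip(k, x))
        if value.denominator == 1:
            return False
    return True
```

For example, the coefficient $1/101$ of the 101-periodic sequence $n \mapsto n/101$ in $\mathbb{R}/\mathbb{Z}$ is 100-irrational but not 101-irrational, which matches the remark that a sequence can be both periodic and quantitatively irrational once the period is large compared with $A$.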

We can now deduce our periodic counting lemma rather simply from the counting 
lemma of Green and Tao. 

Proof of Theorem 4.1. Restrict [8, Theorem 1.11] to $P = [N]^D$ and divide both sides of
the conclusion by $N^D$ to get that for any $(A, N)$-irrational polynomial $h : \mathbb{Z} \to G$ with
$h(0) = \mathrm{id}_G$ we have

$$\mathbb{E}_{n \in [N]^D} F(h^\Phi(n)\Gamma^t) = \int_{G^\Phi/\Gamma^\Phi} F + o_{A \to \infty; M}(1) + o_{N \to \infty; M}(1).$$

Now, a polynomial sequence is $A$-irrational if and only if it is $(A, N_0)$-irrational for some
$N_0$, and thus $(A, N)$-irrational for all $N \geq N_0$. In particular the polynomial $g$ given to
us is $(A, kN)$-irrational for all large enough integers $k$, and so

$$\mathbb{E}_{n \in [kN]^D} F(g^\Phi(n)\Gamma^t) = \int_{g(0)^\Delta G^\Phi/\Gamma^\Phi} F + o_{A \to \infty; M}(1) + o_{kN \to \infty; M}(1). \tag{3}$$

Decomposing $[kN]^D = \{0, N, \dots, (k-1)N\}^D \oplus [N]^D$, the left-hand side of (3) is

$$\mathbb{E}_{n_0 \in \{0, N, \dots, (k-1)N\}^D}\, \mathbb{E}_{n \in [N]^D} F(g^\Phi(n + n_0)\Gamma^t) = \mathbb{E}_{n \in [N]^D} F(g^\Phi(n)\Gamma^t),$$

the last equality being a consequence of the periodicity assumption on $(g(n)\Gamma)$. Letting
$k$ tend to infinity in (3), we then obtain (2). □

5. A factorization theorem and a strengthened regularity lemma

Given the relevance of Theorem 4.1 to solution measures of nilsequences, we now
aim to strengthen our regularity lemma by adding the property of irrationality to the
structured part. A similarly strengthened regularity lemma, in which the structured
part takes the form of a so-called virtual nilsequence, was [8, Theorem 1.2], one of the
main results of that paper. In this section we show that in the periodic setting, when
the period is prime, or more generally when it belongs to a given set of characteristic 0,
we can do away with this notion of virtual nilsequences, obtaining the following result.

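Theorem 5.1 (Periodic regularity lemma with irrational nilsequence).
Let $\mathcal{N} \subset \mathbb{N}$ be a set of characteristic 0, let $s \geq 1$ be an integer, let $\epsilon > 0$, and let
$\mathcal{F} : \mathbb{R}^+ \to \mathbb{R}^+$ be a growth function. Then there is a number $M = O_{s,\epsilon,\mathcal{F},\mathcal{N}}(1)$ such that
for any $N \in \mathcal{N}$ with $p_1(N) > N_0(s, \epsilon, \mathcal{F}, \mathcal{N})$ and any function $f : \mathbb{Z}_N \to [0, 1]$ there is
a decomposition

$$f = f_{\mathrm{nil}} + f_{\mathrm{sml}} + f_{\mathrm{unf}}$$

of $f$ into functions $f_{\mathrm{nil}}, f_{\mathrm{sml}}, f_{\mathrm{unf}} : \mathbb{Z}_N \to [-1, 1]$ such that

(i) $f_{\mathrm{nil}}$ is an $N$-periodic, $\mathcal{F}(M)$-irrational nilsequence of degree at most $s$ and
complexity at most $M$,

(ii) $\|f_{\mathrm{sml}}\|_2 \leq \epsilon$,

(iii) $\|f_{\mathrm{unf}}\|_{U^{s+1}} \leq 1/\mathcal{F}(M)$, and

(iv) $f_{\mathrm{nil}}$ and $f_{\mathrm{nil}} + f_{\mathrm{sml}}$ take values in $[0, 1]$.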

We shall prove this by establishing strengthened analogues, for periodic nilsequences,
of the factorization results in [8, Section 2]. The main factorization result, Lemma 5.6,
is an analogue of [8, Lemma 2.10] saying essentially that any polynomial sequence that
is periodic mod $\Gamma$ is irrational on some subnilmanifold of $G/\Gamma$, modulo factors from $\Gamma$.
Theorem 5.1 then follows easily.

We begin with some simple lemmas. 

Lemma 5.2. Let $g \in \mathrm{poly}(\mathbb{Z}, G_\bullet)$ and let $q \in \mathbb{Z}$. Define $h \in \mathrm{poly}(\mathbb{Z}, G_\bullet)$ by $h(n) =
g(qn)$. Then, for any $i$, the $i$th Taylor coefficient $h_i$ of $h$ satisfies $h_i = g_i^{q^i} \bmod G_i^\nabla$,
where $g_i$ is the $i$th Taylor coefficient of $g$.

Proof. This follows from the proof of [8, Lemma A.8], consisting of many applications
of (applications of) the Baker-Campbell-Hausdorff formula. □

We shall apply this through the following simple consequence. 

Lemma 5.3. Let $g \in \mathrm{poly}(\mathbb{Z}, G_\bullet)$, let $q \in \mathbb{Z}$ and suppose $g(q\mathbb{Z}) \subset \Gamma$. Then for each
$i \geq 0$ there is some $\gamma_i \in \Gamma_i$ such that $g_i^{q^i} = \gamma_i \bmod G_i^\nabla$, where $g_i$ is the $i$th Taylor
coefficient of $g$.

Proof. Set $h(n) = g(qn)$. This is $\Gamma$-valued by assumption, and so its Taylor expansion
is $h(n) = \gamma_0 \gamma_1^{n} \cdots \gamma_s^{\binom{n}{s}}$ for some $\gamma_i \in \Gamma_i$, by Lemma 2.8. Applying Lemma 5.2 we are
done. □

Thus, modulo $G_i^\nabla$, the $i$th Taylor coefficient of a $q$-periodic sequence with trivial
constant term is a $q^i$th root mod $\Gamma_i$. The following lemma, an analogue of [8, Lemma
A.7], lies at the heart of the factorization results in this section.

Lemma 5.4 (Taylor-coefficient factorization). Let $(G/\Gamma, G_\bullet, \mathcal{X})$ be a filtered nilmanifold
of degree at most $s$, let $A > 0$, and suppose $g \in \mathrm{poly}(\mathbb{Z}, G_\bullet)$ satisfies $g(q\mathbb{Z}) \subset \Gamma$ for some
integer $q$ with $p_1(q) > A$. If $g$ is not $A$-irrational then there is an index $i \in [s]$ such
that its $i$th Taylor coefficient $g_i$ factors as $g_i' \gamma_i$, where $\gamma_i \in \Gamma_i$ and $g_i'$ lies in the kernel
of some non-trivial $i$th-level character of complexity at most $A$.

Proof. By definition of $A$-irrationality, there is some $i \in [s]$ for which there is a non-trivial
$i$th-level character $\xi_i : G_i \to \mathbb{R}$ with $\xi_i(g_i) \in \mathbb{Z}$. Let $k \in \mathbb{Z}^r$, $r := m_i - m_{i+1}$, be
the frequency vector for $\xi_i$ given by Lemma 4.4, so that

$$\xi_i(g) = k \cdot \psi_i(g) \quad \text{for each } g \in G_i.$$

Now consider $\xi_i(g_i^{q^i})$. On one hand, this equals $q^i \xi_i(g_i) \in q^i\mathbb{Z}$. On the other, by Lemma
5.3 we have $g_i^{q^i} = \gamma \bmod G_i^\nabla$ for some $\gamma \in \Gamma_i$, and since $\xi_i$ annihilates $G_i^\nabla$ we have

$$\xi_i(g_i^{q^i}) = \xi_i(\gamma) = k \cdot \psi_i(\gamma) \in \mathrm{hcf}(k)\mathbb{Z},$$

$\psi_i(\gamma)$ lying in $\mathbb{Z}^r$ by the definition of Mal'cev bases (see Appendix A). Hence $q^i \xi_i(g_i) \in
\mathrm{lcm}(q^i, \mathrm{hcf}(k))\mathbb{Z} = \mathrm{hcf}(k) q^i \mathbb{Z}$, since $\mathrm{hcf}(k) \leq A$ and $q$ has no prime factors less than $A$
by assumption. Thus $\xi_i(g_i) \in \mathrm{hcf}(k)\mathbb{Z}$, and so there is some vector $t \in \mathbb{Z}^r$ such that
$k \cdot t = \xi_i(g_i)$. Letting $\gamma_i = \psi_i^{-1}(t)$, that is $\gamma_i = \exp(t_1 X_{m - m_i + 1}) \cdots \exp(t_r X_{m - m_{i+1}})$, and
setting $g_i' = g_i \gamma_i^{-1}$, we obtain the result. □
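
The arithmetic at the end of this proof can be illustrated concretely. The following Python sketch is not part of the original argument: the function names (`ext_gcd`, `bezout_vector`, `solve_frequency_equation`) are hypothetical, and it assumes that the value $\xi_i(g_i)$ is supplied as an ordinary integer. It checks that $\mathrm{hcf}(k)$ must divide that value once it is coprime to $q$, and then produces an integer vector $t$ with $k \cdot t = \xi_i(g_i)$ by repeated use of the extended Euclidean algorithm.

```python
from math import gcd


def ext_gcd(a, b):
    """Return (g, x, y) with a*x + b*y == g, where g = gcd(a, b) >= 0."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        quo = old_r // r
        old_r, r = r, old_r - quo * r
        old_x, x = x, old_x - quo * x
        old_y, y = y, old_y - quo * y
    if old_r < 0:
        old_r, old_x, old_y = -old_r, -old_x, -old_y
    return old_r, old_x, old_y


def bezout_vector(k):
    """Return (h, u) with sum(k[j] * u[j]) == h, where h = hcf of the entries of k."""
    h, u = 0, []
    for kj in k:
        g, x, y = ext_gcd(h, kj)
        # Rescale the coefficients found so far and append the new one.
        u = [x * uj for uj in u] + [y]
        h = g
    return h, u


def solve_frequency_equation(k, q, i, value):
    """Find an integer vector t with k . t == value.

    Assumes, as in the proof, that q**i * value is a multiple of hcf(k)
    and that hcf(k) is coprime to q; both hypotheses are checked here.
    """
    h, u = bezout_vector(k)
    if h == 0:
        raise ValueError("k must be a non-zero vector")
    if gcd(h, q) != 1 or (q ** i * value) % h != 0:
        raise ValueError("hypotheses of the sketch are not satisfied")
    # Coprimality of h and q**i forces h to divide value itself.
    assert value % h == 0
    return [(value // h) * uj for uj in u]


if __name__ == "__main__":
    k = [4, 6]
    t = solve_frequency_equation(k, q=7, i=2, value=10)
    print(t, sum(kj * tj for kj, tj in zip(k, t)))
```

The solution vector is far from unique; any choice of $t$ serves equally well in the proof, since only the identity $k \cdot t = \xi_i(g_i)$ is used.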

Remark 5.5. From failure of $A$-irrationality alone, that is with no periodicity-related
assumption on $g$, one can still deduce a factorization $g_i = g_i' \gamma_i$, but with $\gamma_i$ being
$A$-rational instead of actually lying in $\Gamma$. (Here an element $h$ is $A$-rational if $h^t \in \Gamma$ for
some $1 \leq t \leq A$.) Thus one may remove the small term from
[8, Lemma A.7] in the $A$-irrational setting; in fact this version can also be deduced from
[8, Lemma A.7] itself by a compactness argument. We make further remarks on this at
the end of this section. For our purposes, however, it is important that $\gamma_i$ lies in $\Gamma$.

From Lemma 5.4 one can then establish the following result.

Lemma 5.6 (Factorization of periodic polynomials). Let $(G/\Gamma, G_\bullet, \mathcal{X})$ be a filtered
nilmanifold of degree at most $s$ and complexity at most $M_0$, and let $\mathcal{F}$ be a growth
function. Let $g \in \mathrm{poly}(\mathbb{Z}, G_\bullet)$ satisfy $g(0) = \mathrm{id}_G$ and $g(q\mathbb{Z}) \subset \Gamma$, where $p_1(q) > O_{M_0,\mathcal{F}}(1)$.
Then for some $M \in [M_0, O_{M_0,\mathcal{F}}(1)]$ there is a factorization $g = g'\gamma$ with the following
properties.

(i) $g' \in \mathrm{poly}(\mathbb{Z}, G'_\bullet)$ is $\mathcal{F}(M)$-irrational, for a subnilmanifold $(G'/\Gamma', G'_\bullet, \mathcal{X}')$ of
$G/\Gamma$ of complexity $O_M(1)$, $g'(0) = \mathrm{id}_G$ and $g'(q\mathbb{Z}) \subset \Gamma'$.

(ii) $\gamma \in \mathrm{poly}(\mathbb{Z}, G_\bullet)$ is $\Gamma$-valued.

In particular, if $g$ is $q$-periodic mod $\Gamma$ then we have $g = \varepsilon g' \gamma$, where $\varepsilon = \{g(0)\}$ is a
constant, $g'$ and $\gamma$ have the above properties and furthermore $g'$ is $q$-periodic mod $\Gamma'$.

The deduction of this result from Lemma 5.4 is completely analogous to the deduction
of [8, Lemma 2.10] from [8, Lemma A.7], so we defer it to Appendix B.

Remark 5.7. We are mainly interested in periodic polynomials in this paper, but it
is interesting to note the wider applicability of Lemma 5.6. Indeed, subject to the
normalizing condition $g(0) = \mathrm{id}_G$, the assumption $g(q\mathbb{Z}) \subset \Gamma$ is strictly weaker than
$q$-periodicity mod $\Gamma$: consider for example a sequence $g^n h^n$ where $g, h$ are $q$th roots mod
$\Gamma$. Furthermore, among the polynomials $g$ satisfying $g(0) = \mathrm{id}_G$, those with $g(q\mathbb{Z}) \subset \Gamma$
form a subgroup of $\mathrm{poly}(\mathbb{Z}, G_\bullet)$, whereas those that are $q$-periodic mod $\Gamma$ do not. Note
also that the factorization theorem as stated above applies to finite products of periodic
polynomials with trivial constant term, even if they have different periods.

Proof of Theorem 5.1. We begin by applying Theorem 3.4 to $f$ with a function $\mathcal{F}_0$ that
grows sufficiently quickly compared to $\mathcal{F}$, obtaining a decomposition

$$f = f_{\mathrm{nil}} + f_{\mathrm{sml}} + f_{\mathrm{unf}}$$

and an integer $M_0 = O_{s,\epsilon,\mathcal{F}_0,\mathcal{N}}(1)$ such that

(i) $f_{\mathrm{nil}} : \mathbb{Z}_N \to [0, 1]$ is given by $f_{\mathrm{nil}}(n) = F_0(g(n)\Gamma)$ for some Lipschitz function
$F_0 : G/\Gamma \to \mathbb{C}$ on a nilmanifold $(G/\Gamma, G_\bullet, \mathcal{X})$ of degree at most $s$ and complexity
at most $M_0$, $\|F_0\|_{\mathrm{Lip}(\mathcal{X})} \leq M_0$ and $g \in \mathrm{poly}(\mathbb{Z}, G_\bullet)$ is $N$-periodic mod $\Gamma$,

(ii) $\|f_{\mathrm{sml}}\|_2 \leq \epsilon$, and

(iii) $\|f_{\mathrm{unf}}\|_{U^{s+1}} \leq 1/\mathcal{F}_0(M_0)$,

as well as the other properties in that theorem. We then apply Lemma 5.6 with another
growth function $\mathcal{F}_1$ (assuming $p_1(N) > O_{M_0,\mathcal{F}_1}(1)$) to obtain a number $M_1 \in
[M_0, O_{M_0,\mathcal{F}_1}(1)]$ and a polynomial $g' \in \mathrm{poly}(\mathbb{Z}, G'_\bullet)$ that is $\mathcal{F}_1(M_1)$-irrational and
$N$-periodic mod $\Gamma'$ in some subnilmanifold $(G'/\Gamma', G'_\bullet, \mathcal{X}')$ of $G/\Gamma$ of complexity $O_{M_1}(1)$,
and satisfies $\varepsilon g'(n)\Gamma = g(n)\Gamma$.

The nilsequence of the conclusion then consists of the function $F : G'/\Gamma' \to \mathbb{C}$
given by $F(x\Gamma') := F_0(\varepsilon x \Gamma)$, which has $\|F\|_{\mathrm{Lip}(\mathcal{X}')} = O_{M_1}(1)$ by [10, Lemmas A.5 and
A.17], the nilmanifold $(G'/\Gamma', G'_\bullet, \mathcal{X}')$, which has complexity at most $O_{M_1}(1)$, and the
polynomial $g'$. Since

$$f_{\mathrm{nil}}(n) = F_0(g(n)\Gamma) = F(g'(n)\Gamma')$$

we see that the nilsequence $f_{\mathrm{nil}}$ has complexity $M \leq C(M_1)$ relative to these data,
for some growth function $C$. We thus pick $\mathcal{F}_1(x) := \mathcal{F}(C(x))$ so that $g'$ is
$\mathcal{F}(M)$-irrational. In order to ensure part (iii) of the conclusion, it suffices to pick $\mathcal{F}_0$ so that
$\mathcal{F}_0(M_0) \geq \mathcal{F}(M)$, which we can do since $M = O_{M_1}(1) = O_{M_0,\mathcal{F}}(1)$. Finally, of course
$M = O_{s,\epsilon,\mathcal{F},\mathcal{N}}(1)$, and we are done. □

Remark 5.8. From the factorization of Taylor coefficients mentioned in Remark 5.5
one can obtain (by the same iterative procedure as in [8, Lemma 2.10]) a version of
Lemma 5.6 for arbitrary polynomials (with no periodicity assumption), where $\gamma$ is
periodic instead of $\Gamma$-valued, with period bounded in terms of $\mathcal{F}(M)$. Thus one may
factorize an arbitrary polynomial $g(n)$ into

constant × highly $A$-irrational × boundedly periodic,

doing away with the 'smooth' part of the factorization in [8, Lemma 2.10]. The downside
of this (slight) simplification is that one has only $A$-irrationality and not
$(A, N)$-irrationality, though this is catered for in the periodic setting by the corresponding
counting lemma, Theorem 4.1. Let us also note that one can deduce a version of
Lemma 5.6 from the above factorization, by dilating the variable $n$ by some fixed integer
so that the periodic part becomes $\Gamma$-valued. However, one needs to ensure that this
modification conserves irrationality, and the argument ends up being somewhat messier
than the one presented above.

6. Constructing a periodic, irrational polynomial 

Thanks to the regularity and counting lemmas, understanding a discrete average
across some system of linear forms $\Phi$ is essentially reduced to considering integrals of
the form $\int_{G^\Phi/\Gamma^\Phi} F$ for Lipschitz functions $F$ and bounded-complexity nilmanifolds $G/\Gamma$.
We now work in the converse direction: given such an integral, and some large period $q$,
we want to approximate the integral by a discrete average involving some appropriate
$q$-periodic orbit. More precisely, we want to find a sequence $g \in \mathrm{poly}(\mathbb{Z}, G_\bullet)$ that is
$q$-periodic mod $\Gamma$ and has its orbits equidistributed in $G^\Phi/\Gamma^\Phi$. To this end, in view
of the counting lemma, we shall find a highly irrational $g$.

Proposition 6.1 (Existence of a periodic, irrational polynomial).
Let $(G/\Gamma, G_\bullet, \mathcal{X})$ be a filtered nilmanifold of degree at most $s$ and dimension $m$. Then
for any integer $q > (2A)^m$ with $p_1(q) > A$ there exists $g \in \mathrm{poly}(\mathbb{Z}, G_\bullet)$ that is $q$-periodic
mod $\Gamma$ and $A$-irrational.

Our proof of this proposition occupies the remainder of this section. The main
difficulty behind the result is that $q$-periodicity is in general not straightforwardly
characterized in terms of Taylor coefficients, the objects central to the notion of irrationality.
However, there are some instructive cases in which $q$-periodicity is easily related to these
coefficients. For instance, if $g(n) = g_1^n$ is linear, then $q$-periodicity simply corresponds
to $g_1$ being a $q$th root mod $\Gamma$, and it is not hard to construct an irrational $q$th root. For
general filtrations, however, it is impossible for linear polynomials to be irrational, since
for these polynomials any Taylor coefficient $g_i$ with $i \geq 2$ is trivial. Another case is when
the group is abelian; for example a polynomial $g(n) = a_0 + a_1 n + a_2 \binom{n}{2} + \cdots + a_s \binom{n}{s}$
over $\mathbb{R}$ is $q$-periodic mod $\mathbb{Z}$ for a prime $q > s$ if and only if $a_i$ is a $q$th root mod $\mathbb{Z}$ for
each $i \geq 1$, i.e. $a_i \in \mathbb{Z}/q$. (Also, as explained in [8, Appendix A], irrationality in this
example consists in $a_s$ not being a rational with small denominator.) In general, however,
each Taylor coefficient $g_i$ being a $q$th root is not sufficient for $g$ to be $q$-periodic; it is
not hard to construct examples of this using real $4 \times 4$ upper-unitriangular matrices.

In both cases above, what yields the simple characterization of periodicity is the
ambient commutativity, which fails in the general setting. Taking heed of this, our
proof builds up the desired polynomial sequence iteratively, starting in the degree 1
setting of $G_1/G_2$, and working at stage $i$ essentially with $G_i/G_{i+1}$, thus benefiting from
commutativity at various points of the construction.

The irrationality input will come from the following lemma. (Recall the notation
$r_i = m_i - m_{i+1}$.)

Lemma 6.2. Let $(G/\Gamma, G_\bullet, \mathcal{X})$ be a nilmanifold of degree at most $s$ and let $i \in [s]$. Let
$A > 0$ and let $q > (2A)^{r_i}$ be an integer with $p_1(q) > A$. Then, for any $h \in G_i$, there
exists $w \in G_i$ that is a $q$th root mod $\Gamma_i$ such that the product $hw$ is $A$-irrational (in $G_i$).

Proof. For any $\gamma \in \Gamma_i$, there is a $w \in G_i$ for which $w^q = \gamma$, namely $w = \exp(\tfrac{1}{q} \log \gamma)$.
We shall thus focus on picking $\gamma$ instead of $w$. Now, $hw$ is $A$-irrational if and only if
$\xi(hw) \notin \mathbb{Z}$ for any non-trivial $i$th-level character $\xi$ of complexity at most $A$, that is iff
$\xi(h^q \gamma) \notin q\mathbb{Z}$, which will certainly be the case, by Lemma 4.4, if

$$k \cdot \psi_i(\gamma) \neq a_k \bmod q \quad \text{for any } k \in \mathbb{Z}^r \text{ with } 0 < |k| \leq A,$$

where the $a_k$ are some real numbers coming from the $\xi(h^q)$, and $r := r_i$. All we need
to do, then, is pick an integer vector $t \in \mathbb{Z}^r$ such that $k \cdot t \neq a_k \bmod q$ for any such
$k$, after which we simply set $\gamma := \psi_i^{-1}(t)$. But we can do this by a simple counting
argument: for any $k \in \mathbb{Z}^r$ with $\mathrm{hcf}(k, q) = 1$, there are precisely $q^{r-1}$ solutions $t \in [q]^r$
to $k \cdot t = a_k \bmod q$. Since there are at most $(A + 1)^r - 1$ vectors $k$ to be considered,
provided $q^r > (A + 1)^r q^{r-1}$ and $q$ only has prime factors bigger than $A$, there will be
some $t \in [q]^r$ such that $k \cdot t \neq a_k \bmod q$ for any $k$ with $0 < |k| \leq A$. □
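
The counting argument in this proof is easy to try out on small parameters. The Python sketch below is an illustration only, under stated assumptions: `find_good_vector` and `count_solutions` are hypothetical names, $|k|$ is read as the sup norm (a choice made here, not taken from the text), and the targets $a_k$ are supplied as integers by a callable `a` (non-integer targets would impose no constraint at all). With the sup norm there are $(2A+1)^r - 1$ vectors $k$, so the sketch takes $q$ larger than that count.

```python
from itertools import product


def dot(k, t):
    return sum(kj * tj for kj, tj in zip(k, t))


def count_solutions(q, k, c):
    """Number of t in [q]^r with k . t == c (mod q), found by brute force."""
    r = len(k)
    return sum(1 for t in product(range(q), repeat=r) if (dot(k, t) - c) % q == 0)


def find_good_vector(q, r, A, a):
    """Search [q]^r for t with k . t != a(k) (mod q) for every non-zero k
    whose entries lie in [-A, A]. Returns None if no such t exists."""
    ks = [k for k in product(range(-A, A + 1), repeat=r) if any(k)]
    for t in product(range(q), repeat=r):
        if all((dot(k, t) - a(k)) % q != 0 for k in ks):
            return t
    return None


if __name__ == "__main__":
    q, r, A = 11, 2, 1
    # Each congruence excludes exactly q**(r-1) vectors t.
    print(count_solutions(q, (1, -1), 3))
    # 8 congruences exclude at most 8 * 11 = 88 < 121 vectors, so a good t exists.
    print(find_good_vector(q, r, A, a=lambda k: sum(k)))
```

In the lemma itself the vector $t$ is then turned into $\gamma = \psi_i^{-1}(t)$; the sketch stops at the combinatorial step.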

Remark 6.3. The element $hw$ produced above is actually irrational in a stronger sense
than the lemma suggests: it satisfies $\xi(hw) \notin \mathbb{Z}$ even if $\xi$ is not required to vanish on
the groups $[G_j, G_{i-j}]$ (as $i$th-level characters in general are).

We shall also require the following lemma on the Taylor coefficients of a polynomial
with restricted derivatives.

Lemma 6.4 (Taylor coefficients of differentiated polynomials). Let $g \in \mathrm{poly}(\mathbb{Z}, G_\bullet)$
where $G_\bullet$ has degree at most $s$. If $\partial_{h_1} \cdots \partial_{h_i} g(n) \in G_{i+1}$ for all $i \geq 0$ and $h_1, \ldots, h_i, n \in
\mathbb{Z}$, then we have $g_i \in G_{i+1}$ for each Taylor coefficient $g_i$ of $g$.

Proof. This follows essentially from the fact that Lemma 2.8 remains valid under the
weaker assumption that $G_\bullet$ is a prefiltration rather than a filtration, as recorded
in Appendix C. (Following [10], a prefiltration is like a filtration but with the weaker
requirement $G \supseteq G_0 \supseteq G_1$.) Indeed, the assumption on $g$ ensures that it lies in
$\mathrm{poly}(\mathbb{Z}, G_\bullet^{+1})$, where $G_\bullet^{+1}$ is the prefiltration $(G_{i+1})_{i \geq 0}$ of degree at most $s - 1$. Thus
we can write $g(n) = g_0 g_1^{n} \cdots g_{s-1}^{\binom{n}{s-1}}$ for some $g_i \in G_{i+1}$ by Lemma C.1. The result now
follows by the uniqueness of Taylor coefficients. □

We can now prove the main result of this section, following essentially the
above-mentioned iterative process.

Proof of Proposition 6.1. At stage $i$ of the proof we obtain a polynomial $g \in \mathrm{poly}(\mathbb{Z}, G_\bullet)$
such that $g(n + q)^{-1} g(n) \in \Gamma \cdot G_{i+1}$ for all $n \in \mathbb{Z}$ and such that the first $i + 1$ Taylor
coefficients of $g$ are $A$-irrational. We shall then be done after stage $s$.

For $i = 0$ we set $g(n) = \mathrm{id}_G$ for all $n$; this trivially satisfies the required properties
since there are no non-trivial 0th level characters. Suppose, then, that we have a
polynomial $g \in \mathrm{poly}(\mathbb{Z}, G_\bullet)$ such that $n \mapsto g(n + q)^{-1} g(n)$ takes values in $\Gamma \cdot G_i$ and
such that $g_0, g_1, \ldots, g_{i-1}$ are $A$-irrational; we shall use this to produce a new polynomial
$\tilde{g}$ with $\tilde{g}(n + q)^{-1} \tilde{g}(n) \in \Gamma \cdot G_{i+1}$ for all $n$ and with $\tilde{g}_j$ being $A$-irrational for $j =
0, 1, \ldots, i$. We may suppose that $g(n) = g_1^{n} \cdots g_{i-1}^{\binom{n}{i-1}}$. Set $h(n) = g(n + q)^{-1} g(n)$; this
is a polynomial map, and by assumption it is $\Gamma \cdot G_i$-valued. By Lemma 2.8 we can
therefore write $h(n) G_i = \gamma_0 \gamma_1^{n} \cdots \gamma_{i-1}^{\binom{n}{i-1}} G_i$ for some $\gamma_j \in \Gamma_j$, which we take to be the
Taylor coefficients of a polynomial $\gamma \in \mathrm{poly}(\mathbb{Z}, G_\bullet)$. Thus we may factorize

$$h(n) = \gamma(n)\, \tilde{h}(n), \qquad (4)$$

where $\tilde{h} \in \mathrm{poly}(\mathbb{Z}, G_\bullet)$ is $G_i$-valued. We shall attempt to cancel out the contribution of
this $G_i$-valued part $\tilde{h}$, and for this we need some information about its Taylor coefficients.
First we have $\tilde{h}_j \in G_i$ for all $j$, by Lemma 2.8. Then, looking at (4) mod $G_{i+1}$ and
using the centrality of $G_i$ mod $G_{i+1}$ we have

$$h_0 h_1^{n} \cdots h_i^{\binom{n}{i}} G_{i+1} = (\gamma_0 \tilde{h}_0)(\gamma_1 \tilde{h}_1)^{n} \cdots (\gamma_{i-1} \tilde{h}_{i-1})^{\binom{n}{i-1}}\, \tilde{h}_i^{\binom{n}{i}} G_{i+1},$$

whence $h_i = \tilde{h}_i \bmod G_{i+1}$. But $h$ is a 'differentiated' polynomial, $h = \partial_q g^{-1}$, and so
$\partial_{d_1} \cdots \partial_{d_i} h(n) \in G_{i+1}$ for all $d_1, \ldots, d_i, n \in \mathbb{Z}$, whence $h_i$, and so $\tilde{h}_i$, is trivial mod
$G_{i+1}$ by Lemma 6.4.

We are now almost ready to produce our new polynomial: it will be $\tilde{g}(n) :=
g(n)\, \ell(n)\, w^{\binom{n}{i}}$ for some $G_i$-valued polynomial $\ell \in \mathrm{poly}(\mathbb{Z}, G_\bullet)$ and some $q$th root $w \in G_i$
that we shall specify shortly. In fact we shall pick $\ell$ to be essentially an integral of (the
inverse of) $\tilde{h}$: it will satisfy

$$\ell(n)^{-1} \ell(n + q)\, G_{i+1} = \tilde{h}(n)\, G_{i+1}. \qquad (5)$$

We can obtain such an $\ell$ by picking its Taylor coefficients $\ell_1, \ldots, \ell_i$ inductively to satisfy
the system

$$\ell_i^{q} = \tilde{h}_{i-1},$$
$$\vdots$$
$$\ell_1^{q}\, \ell_2^{\binom{q}{2}} \cdots \ell_i^{\binom{q}{i}} = \tilde{h}_0.$$

This yields coefficients $\ell_j \in G_i$ since each $\tilde{h}_j$ lies in $G_i$. Since $G_i$ is central mod $G_{i+1}$,
(5) is easily seen to hold using the identity $\binom{n+q}{j} - \binom{n}{j} = \sum_{k=1}^{j} \binom{q}{k} \binom{n}{j-k}$. We then have

$$\tilde{g}(n + q)^{-1} \tilde{g}(n)\, G_{i+1} = \gamma(n)\, \tilde{h}(n)\, \ell(n + q)^{-1} \ell(n)\, w^{\binom{n}{i} - \binom{n+q}{i}}\, G_{i+1} \subset w^{\binom{n}{i} - \binom{n+q}{i}}\, \Gamma \cdot G_{i+1}.$$

Thus $\tilde{g}(n + q)^{-1} \tilde{g}(n) \in \Gamma \cdot G_{i+1}$ for all $n$ as desired provided $w \in G_i$ is a $q$th root mod
$\Gamma_i$. But we also need $\tilde{g}_0, \ldots, \tilde{g}_i$ to be $A$-irrational. To this end, note that for each $j < i$
the Taylor coefficient $\tilde{g}_j$ is automatically $A$-irrational since it is congruent to $g_j$ mod
$G_i$, and so we need only consider $\tilde{g}_i$. Now, mod $G_{i+1}$ we have $\tilde{g}_i = \ell_i w$; we thus pick
$w \in G_i$ to be a $q$th root mod $\Gamma_i$ for which $\ell_i w$ is $A$-irrational, as we may by Lemma 6.2,
and the proof is complete. □
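
The binomial identity used in this proof, and the abelian example discussed earlier in this section, can both be checked numerically. The Python sketch below is offered only as an illustration; the function names are hypothetical and the sample coefficients are arbitrary. It verifies the identity $\binom{n+q}{j} - \binom{n}{j} = \sum_{k=1}^{j} \binom{q}{k}\binom{n}{j-k}$ for small non-negative arguments, and tests $q$-periodicity mod $\mathbb{Z}$ of a real polynomial $\sum_i a_i \binom{n}{i}$ with exact rational arithmetic.

```python
from fractions import Fraction
from math import comb


def shift_identity_holds(n, q, j):
    """Check C(n+q, j) - C(n, j) == sum_{k=1..j} C(q, k) * C(n, j-k)."""
    lhs = comb(n + q, j) - comb(n, j)
    rhs = sum(comb(q, k) * comb(n, j - k) for k in range(1, j + 1))
    return lhs == rhs


def poly_value(coeffs, n):
    """Evaluate sum_i coeffs[i] * C(n, i) at a non-negative integer n."""
    return sum(a * comb(n, i) for i, a in enumerate(coeffs))


def is_q_periodic_mod_Z(coeffs, q, n_max=50):
    """Test g(n + q) - g(n) in Z for 0 <= n < n_max (a finite check only)."""
    return all(
        Fraction(poly_value(coeffs, n + q) - poly_value(coeffs, n)).denominator == 1
        for n in range(n_max)
    )


if __name__ == "__main__":
    print(all(shift_identity_holds(n, q, j)
              for n in range(8) for q in range(8) for j in range(6)))
    # Degree s = 3 and prime q = 7 > s, with a_i in (1/7)Z for i >= 1.
    good = [Fraction(1, 3), Fraction(2, 7), Fraction(5, 7), Fraction(3, 7)]
    print(is_q_periodic_mod_Z(good, 7))
    # With q = 3 not exceeding the degree, a qth-root coefficient need not suffice.
    print(is_q_periodic_mod_Z([0, 0, 0, Fraction(1, 3)], 3))
```

The second example shows why the condition $q > s$ appears in the abelian case: $\binom{q}{k}$ is divisible by the prime $q$ only for $1 \leq k < q$.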



7. Transference: moving from $\mathbb{Z}_N$ to $\mathbb{Z}_M$

We now have all the tools required to prove Theorem 1.8, the result lying at the
heart of the combinatorial applications in this paper. We shall in fact prove the following
mild generalization.

Theorem 7.1 (Periodic transference).
Let $\mathcal{N} \subset \mathbb{N}$ be a set of characteristic 0, let $L \geq 1$ be an integer and let $\delta \in (0, 1)$.
Then for any $N \in \mathcal{N}$ and $M \in \mathcal{N}$ with $p_1(N), p_1(M) > N_0(\delta, L, \mathcal{N})$, and any function
$f : \mathbb{Z}_N \to [0, 1]$, there is a function $f' : \mathbb{Z}_M \to [0, 1]$ such that, for any system $\Phi$ of
integer linear forms of size at most $L$, any two of which are linearly independent, we
have $|S_\Phi(f' : \mathbb{Z}_M) - S_\Phi(f : \mathbb{Z}_N)| \leq \delta$.

Proof. Let $\epsilon > 0$ and $\mathcal{F} : \mathbb{R}^+ \to \mathbb{R}^+$ be a growth function, both to be specified in terms
of $\delta$ and $L$ later, and let $s = s(L)$ be as given by Theorem 2.2. Assuming $p_1(N) >
O_{s,\epsilon,\mathcal{F},\mathcal{N}}(1)$, we apply Theorem 5.1 to $f$ to obtain a decomposition $f = f_{\mathrm{nil}} + f_{\mathrm{sml}} + f_{\mathrm{unf}}$
and an integer $Q = O_{s,\epsilon,\mathcal{F},\mathcal{N}}(1)$ such that

(i) $f_{\mathrm{nil}}(n) = F(g(n)\Gamma)$, where $(g(n)\Gamma)$ is an $N$-periodic polynomial orbit on some
nilmanifold $(G/\Gamma, G_\bullet, \mathcal{X})$ of degree at most $s$ and complexity at most $Q$, $g \in
\mathrm{poly}(\mathbb{Z}, G_\bullet)$ is $\mathcal{F}(Q)$-irrational, and $F : G/\Gamma \to \mathbb{C}$ satisfies $\|F\|_{\mathrm{Lip}(\mathcal{X})} \leq Q$,

(ii) $\|f_{\mathrm{sml}}\|_2 \leq \epsilon$,

(iii) $\|f_{\mathrm{unf}}\|_{U^{s+1}} \leq 1/\mathcal{F}(Q)$, and

(iv) $f_{\mathrm{nil}}$ and $f_{\mathrm{nil}} + f_{\mathrm{sml}}$ take values in $[0, 1]$.

Furthermore, since $f_{\mathrm{nil}}$ is $[0, 1]$-valued, we may assume that $F$ is real-valued by taking
real parts, and then by replacing it with $\max(\min(F, 1), 0)$ we may in fact assume that
it is also $[0, 1]$-valued; neither of these operations can increase $\|F\|_{\mathrm{Lip}(\mathcal{X})}$.
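
The last claim rests on two pointwise inequalities: $|\mathrm{Re}\, a - \mathrm{Re}\, b| \leq |a - b|$ for complex $a, b$, and $|c(u) - c(v)| \leq |u - v|$ for the clipping map $c(u) = \max(\min(u, 1), 0)$. The Python sketch below illustrates this on a hypothetical one-dimensional sample (a real interval with the usual distance) rather than on a nilmanifold; the sample function and the name `lipschitz_constant` are assumptions made for the example.

```python
import math


def clip01(v):
    """Clip a real number to the interval [0, 1]."""
    return max(min(v, 1.0), 0.0)


def lipschitz_constant(points, values):
    """Largest difference quotient |F(x) - F(y)| / |x - y| over the sample."""
    best = 0.0
    for i in range(len(points)):
        for j in range(i):
            ratio = abs(values[i] - values[j]) / abs(points[i] - points[j])
            best = max(best, ratio)
    return best


if __name__ == "__main__":
    points = [j / 10 for j in range(60)]
    # An arbitrary complex-valued sample function, chosen only for illustration.
    values = [1.5 * math.cos(x) + 0.7j * math.sin(3 * x) for x in points]
    real_parts = [v.real for v in values]
    clipped = [clip01(v) for v in real_parts]
    print(lipschitz_constant(points, values))
    print(lipschitz_constant(points, real_parts))
    print(lipschitz_constant(points, clipped))
```

Each printed constant should be no larger than the one before it, up to rounding; the sup norm likewise cannot increase under either operation.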

Now let $\Phi = (\phi_1, \ldots, \phi_t)$ be any system of pairwise independent linear forms $\phi_i :
\mathbb{Z}^D \to \mathbb{Z}$ of size at most $L$. Then by Theorem 2.2 and (1) we have

$$|S_\Phi(f) - S_\Phi(f_{\mathrm{nil}})| \leq L\epsilon + L/\mathcal{F}(Q).$$

The parameter $\epsilon$ will play no further role, so let us fix already $\epsilon = \delta/3L$.

We now deal with $S_\Phi(f_{\mathrm{nil}})$ using the periodic counting lemma, Theorem 4.1. The
Lipschitz function we use is $F^{\otimes t} : G^t/\Gamma^t \to \mathbb{C}$, given by $F^{\otimes t}(x_1, \ldots, x_t) = F(x_1) \cdots F(x_t)$;
note that this has $\|F^{\otimes t}\|_{\mathrm{Lip}(\mathcal{X}^t)} \leq C_{Q,L}$ by Lemma A.4. Applying the counting lemma
to this function, with parameter $Q' := \max(L, s(L), Q, C_{Q,L})$ instead of $Q$, we obtain

$$S_\Phi(f_{\mathrm{nil}}) = \mathbb{E}_{n \in \mathbb{Z}_N^D} F^{\otimes t}(g^\Phi(n)\Gamma^\Phi) = \int_{g(0)^\Delta G^\Phi/\Gamma^\Phi} F^{\otimes t} + o_{\mathcal{F}(Q) \to \infty;\, Q, L}(1).$$

We can choose $\mathcal{F}$ with sufficiently fast growth in terms of $\delta$ and $L$ to then have

$$\left| S_\Phi(f) - \int_{g(0)^\Delta G^\Phi/\Gamma^\Phi} F^{\otimes t} \right| \leq 2\delta/3.$$

We now transfer to the group $\mathbb{Z}_M$. Let $A = A(Q')$ be large enough so that the error
term in Theorem 4.1 is at most $\delta/3$, and let $h \in \mathrm{poly}(\mathbb{Z}, G_\bullet)$ be the $M$-periodic,
$A$-irrational polynomial given by Proposition 6.1, noting that we may assume $h(0) = g(0)$;
provided $p_1(M)$ is large enough we can find this. Theorem 4.1 then gives us

$$\left| \int_{g(0)^\Delta G^\Phi/\Gamma^\Phi} F^{\otimes t} - \mathbb{E}_{n \in \mathbb{Z}_M^D} F^{\otimes t}(h^\Phi(n)\Gamma^\Phi) \right| \leq \delta/3.$$

Setting $f' : \mathbb{Z}_M \to [0, 1]$, $f'(n) = F(h(n)\Gamma)$, the expectation above is precisely $S_\Phi(f')$,
so that $|S_\Phi(f') - S_\Phi(f)| \leq \delta$ as required. □



8. Applications 

We are now ready to establish Theorems 1.3 and 1.5. We begin with the former,
which we restate now in the stronger form that was mentioned in the introduction. In
the version below, the primality restriction on $N$ is replaced with the weaker requirement
that $N$ belong to a set of characteristic 0 (recall Definition 3.1).

Theorem 8.1. Let $\Phi$ be a system of integer linear forms, any two of which are linearly
independent. Then for any $\alpha \in [0, 1]$ there is a number $m_\Phi(\alpha)$ such that
$m_\Phi(\alpha, N) \to m_\Phi(\alpha)$ as $N \to \infty$ through $\mathcal{N}$, for any set $\mathcal{N} \subset \mathbb{N}$ of characteristic 0.

The proof involves the following standard result. 

Lemma 8.2. For any positive integer $d$ and any function $f : \mathbb{Z}_N \to [0, 1]$ there exists a
set $A \subset \mathbb{Z}_N$ such that $\|1_A - f\|_{U^d(\mathbb{Z}_N)} \ll_d N^{-1/2^d}$.

This follows easily from [23, Example 11.1.17]; one can prove it by picking the set
$A$ randomly, letting each element $x \in \mathbb{Z}_N$ lie in $A$ with probability $f(x)$ independently.

Proof of Theorem 8.1. Suppose $\Phi$ has size at most $L$, and fix any $\alpha \in [0, 1]$. We first
establish convergence for any fixed $\mathcal{N}$ (to some limit possibly dependent on $\mathcal{N}$).

Given such a set $\mathcal{N}$, fix an arbitrary $\epsilon \in (0, 1)$. Let $N_0$ be the integer obtained by
applying Theorem 7.1 with $\delta = \epsilon/2(L + 1)$, and let $s$ and $C_L$ be as given by Theorem
2.2. Let $N, M$ be any elements of $\mathcal{N}$ such that $N, M \gg_s (L/\delta)^{2^{s+1}}$ and $p_1(N), p_1(M) >
\max\{L, C_L, N_0\}$, and let $A$ be a subset of $\mathbb{Z}_N$ of size at least $\alpha N$ satisfying $S_\Phi(A) =
m_\Phi(\alpha, N)$. Then Theorem 7.1 gives us a function $f' : \mathbb{Z}_M \to [0, 1]$ with $\mathbb{E}_{\mathbb{Z}_M} f' \geq \alpha - \delta$
and such that $S_\Phi(f') \leq m_\Phi(\alpha, N) + \delta$. Applying Lemma 8.2 to $f'$ with $d = s + 1$, we
obtain a subset $B'$ of $\mathbb{Z}_M$ such that $\|1_{B'} - f'\|_{U^{s+1}(\mathbb{Z}_M)} \leq \delta/L$. We then have $|B'| \geq
(\alpha - 2\delta)M$, since $|\mathbb{E}_{\mathbb{Z}_M}(f' - 1_{B'})| \leq \|f' - 1_{B'}\|_{U^{s+1}(\mathbb{Z}_M)}$. Moreover, Theorem 2.2 gives us
that $S_\Phi(B') \leq m_\Phi(\alpha, N) + 2\delta$. Now we add at most $2\delta M$ elements from $\mathbb{Z}_M \setminus B'$ to $B'$,
obtaining a set $B \subset \mathbb{Z}_M$ of size at least $\alpha M$ which, by (1), satisfies $S_\Phi(B) \leq S_\Phi(B') +
2\delta L$. It follows that $S_\Phi(B) \leq m_\Phi(\alpha, N) + 2(L + 1)\delta$, whence $m_\Phi(\alpha, M) \leq m_\Phi(\alpha, N) + \epsilon$.
Arguing the same way with $N$ and $M$ interchanged, we obtain $m_\Phi(\alpha, N) \leq m_\Phi(\alpha, M) + \epsilon$.
Thus $(m_\Phi(\alpha, N))_{N \in \mathcal{N}}$ is a Cauchy sequence.

Now if $\mathcal{N}'$ is another set of characteristic 0, then noting that $\mathcal{N} \cup \mathcal{N}'$ is also of
characteristic 0, we deduce that the limit of $(m_\Phi(\alpha, N))$ for $\mathcal{N}'$ is equal to that for
$\mathcal{N}$. □
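
The random construction behind Lemma 8.2, used in the proof above, can be tried out for $d = 2$ on a small cyclic group. The following Python sketch is only an illustration and proves nothing about rates: the names `sample_set` and `u2_norm` are hypothetical, the choices $N = 31$ and $f \equiv 1/2$ are arbitrary, and the $U^2$ norm is computed directly from its definition $\|g\|_{U^2}^4 = \mathbb{E}_{x,a,b}\, g(x) g(x+a) g(x+b) g(x+a+b)$ for real-valued $g$.

```python
import random


def sample_set(f, rng):
    """Indicator vector of a random set: x is included with probability f[x]."""
    return [1 if rng.random() < p else 0 for p in f]


def u2_norm(g):
    """U^2 norm of a real-valued function on Z_N, by brute force in O(N^3)."""
    N = len(g)
    total = 0.0
    for x in range(N):
        for a in range(N):
            for b in range(N):
                total += g[x] * g[(x + a) % N] * g[(x + b) % N] * g[(x + a + b) % N]
    # The exact average is non-negative; guard against rounding just below zero.
    return max(total / N ** 3, 0.0) ** 0.25


if __name__ == "__main__":
    N = 31
    f = [0.5] * N
    indicator = sample_set(f, random.Random(0))
    difference = [a - p for a, p in zip(indicator, f)]
    print(sum(indicator), u2_norm(difference))
```

For larger $N$ one would compute the norm through the Fourier transform instead, since $\|g\|_{U^2}^4$ equals the sum of the fourth powers of the absolute values of the Fourier coefficients of $g$.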

We now turn to Theorem 1.5, which we shall establish in the following stronger
form.

Theorem 8.3. Let $\mathcal{F}$ be a finite family of systems of integer linear forms, in each of
which the forms are pairwise linearly independent. Then there is a number $d_{\mathcal{F}}$ such that
$d_{\mathcal{F}}(\mathbb{Z}_N) \to d_{\mathcal{F}}$ as $N \to \infty$ through $\mathcal{N}$, for any set $\mathcal{N} \subset \mathbb{N}$ of characteristic 0.

We shall use the following result, known as an arithmetic removal lemma, which
follows from the more general removal lemma of Král', Serra, and Vena [14].

Theorem 8.4. Let $\Phi$ be a system of linear forms $\phi_1, \ldots, \phi_t : \mathbb{Z}^D \to \mathbb{Z}$. Then there
exists a positive integer $K$ such that the following holds. For any $\epsilon > 0$, there exists
$\delta = \delta(\epsilon, \Phi) > 0$ such that if $N \in \mathbb{N}$ is prime to $K$, and $A_1, \ldots, A_t$ are subsets of $\mathbb{Z}_N$ such
that $S_\Phi(A_1, \ldots, A_t) \leq \delta$, then there exist sets $E_i \subset \mathbb{Z}_N$ with $|E_i| \leq \epsilon N$ for all $i \in [t]$,
such that $S_\Phi(A_1 \setminus E_1, \ldots, A_t \setminus E_t) = 0$.

The proof is a straightforward deduction, given in Appendix [El

Proof of Theorem 8.3. Let $n$ be the cardinality of the given finite family $\mathcal{F}$, and let $L$
be a uniform upper bound on the sizes of the systems in $\mathcal{F}$. As in the proof of Theorem
8.1 it suffices to establish convergence for any given $\mathcal{N}$.

Having fixed $\mathcal{N}$, let us fix an arbitrary $\epsilon > 0$. Let $\delta \in (0, \epsilon/2)$ be such that Theorem
8.4 holds for each $\Phi \in \mathcal{F}$, with initial parameter $\epsilon/2n$. Now let $C = C(\epsilon, L)$ be such that
Theorem 7.1 and Lemma 8.2 hold with main upper bound $\delta/2L$, for any $N, M \in \mathcal{N}$
with $p_1(N), p_1(M) > C$. We claim that $|d_{\mathcal{F}}(\mathbb{Z}_N) - d_{\mathcal{F}}(\mathbb{Z}_M)| \leq \epsilon$ for any such $N, M$.

To see this let $\alpha = d_{\mathcal{F}}(\mathbb{Z}_N)$ and let $A \subset \mathbb{Z}_N$ be $\mathcal{F}$-free with $|A| = \alpha N$. Then by
Theorem 7.1 there exists $f' : \mathbb{Z}_M \to [0, 1]$ with $|\mathbb{E}_{\mathbb{Z}_M} f' - \alpha| \leq \delta/2$ such that $S_\Phi(f') \leq \delta/2$
for every $\Phi \in \mathcal{F}$. Applying Lemma 8.2 to $f'$, we obtain a subset $B'$ of $\mathbb{Z}_M$ with
$\|1_{B'} - f'\|_{U^{s+1}} \leq \delta/2L$, where $s = s(L)$ is as given by Theorem 2.2. As in the proof
of Theorem 8.1 this then implies $|\alpha - |B'|/M| \leq \delta$, and Theorem 2.2 also gives that
$S_\Phi(B') \leq \delta$ for every $\Phi \in \mathcal{F}$.

Now, by our choice of $\delta$, Theorem 8.4 applied to each $\Phi \in \mathcal{F}$ gives us an $\mathcal{F}$-free
subset $B$ of $B'$ with $|\alpha - |B|/M| \leq \delta + \epsilon/2$, whence $d_{\mathcal{F}}(\mathbb{Z}_M) \geq d_{\mathcal{F}}(\mathbb{Z}_N) - \epsilon$. Arguing
similarly with $N, M$ interchanged, our claim follows. Thus $(d_{\mathcal{F}}(\mathbb{Z}_N))_{N \in \mathcal{N}}$ is a Cauchy
sequence. □

We close this section with the following result mentioned in the introduction.

Lemma 8.5. Let $\mathcal{F}$ be a finite family of non-invariant systems of integer linear forms,
in each of which the forms are pairwise independent. Then $d_{\mathcal{F}} > 0$.

Proof. After converting from systems of linear forms to systems of linear equations, this
lemma follows immediately from a similar result for families of single equations. The
latter result was recorded as [3, Proposition 1.4] and its proof consisted in a simple
construction based on an idea employed by Ruzsa [20, Theorem 2.1]. To convert to
systems of equations, then, assign to each $\Phi \in \mathcal{F}$ an integer matrix $A$ as in the proof of
Theorem 8.4. Note that the pairwise independence condition implies that any row of any
such $A$ has at least three non-zero coefficients. Labeling these matrices $A_1, A_2, \ldots, A_n$,
we now form a family of non-invariant integer linear forms $L_1, \ldots, L_n$, by defining the
coefficients of $L_i$ to be the entries in a chosen row of $A_i$ not summing to 0. It is clear
that $d_{\mathcal{F}}(\mathbb{Z}_N)$ is always at least the maximum density of a subset of $\mathbb{Z}_N$ avoiding the
equations $L_i(x) = 0$, $i \in [n]$, so we are done. □

9. Remarks

If we consider vector spaces over finite fields instead of cyclic groups then the
analogue of Theorem 1.8 is a more exact statement telling us that for any $n$ and $m \geq n$
we can transfer a function on $\mathbb{F}^n$ to one on $\mathbb{F}^m$ having equal solution-measures for any
$\Phi$. In contrast with Theorem 1.8, however, the latter result is rather trivial due to the
fact that $\mathbb{F}^n$ can be embedded as a subgroup of $\mathbb{F}^m$. This indicates that the finite-field
viewpoint is less useful here than it is for several other well-known problems in additive
combinatorics, the non-triviality of Theorem 1.8 being more strongly related to the
cyclic group setting, in which there can be a complete lack of non-trivial subgroups.

The questions of the convergence of the minimal solution measures and the analogues
of $d_{\mathcal{F}}$ are also interesting for the integral setting of $[N]$ (once defined appropriately);
indeed, the latter question was raised for single linear equations by Ruzsa [20]. One
may establish a transference result in this setting (and thus convergence) relatively
straightforwardly from Green and Tao's results in [8]: one can follow the structure of
the proof of Theorem 1.8, except that to obtain $f' : [M] \to [0, 1]$ (for $M > N$) one
can simply extend the domain of definition of $f_{\mathrm{nil}}$, the main point being that since $f_{\mathrm{nil}}$
equidistributes well already up to time $N$, it automatically does so up to time $M$ as
well.

In the setting of Croot's original convergence result [I], there is a nice relation 
between the minimum and maximum possible counts of 3-term progressions in sets of
various densities, thanks to the formula $S_{3\mathrm{AP}}(A) + S_{3\mathrm{AP}}(A^c) = 1 - 3\alpha + 3\alpha^2$ for sets
$A \subset \mathbb{Z}_N$ of density $\alpha$ (provided $N$ is odd). Thus a set has the minimal number of 3-term
progressions for sets of density $\alpha$ if and only if its complement has the maximal number
for sets of density $1 - \alpha$. From this it is immediate that the quantity $M_{3\mathrm{AP}}(\alpha, N) :=
\max_{A \subset \mathbb{Z}_N,\, |A| \leq \alpha N} S_{3\mathrm{AP}}(A)$ also converges as $N \to \infty$ over primes. There is a similar
relation for solution counts of other single linear equations in an odd number of variables,
but for more general systems of equations no such relation holds. Nevertheless, the
methods of this paper do of course allow one to deduce convergence in this regime:

Theorem 9.1. Let $\Phi$ be a system of integer linear forms, any two of which are linearly
independent. For any $\alpha \in [0, 1]$ and $N \in \mathbb{N}$, let $M_\Phi(\alpha, N) := \max_{A \subset \mathbb{Z}_N,\, |A| \leq \alpha N} S_\Phi(A)$.
Then $M_\Phi(\alpha, N)$ converges as $N \to \infty$ through primes.
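
The complementation formula for 3-term progressions quoted before Theorem 9.1 can be confirmed by exhaustive search on small cyclic groups. The Python sketch below is an illustration only; the names `s3ap` and `identity_holds_for_all_sets` are hypothetical, and $S_{3\mathrm{AP}}(A)$ is taken to be the normalized count of pairs $(x, d) \in \mathbb{Z}_N^2$ with $x, x + d, x + 2d \in A$ (trivial progressions included), which is an assumption about the normalization.

```python
from fractions import Fraction
from itertools import combinations


def s3ap(A, N):
    """Normalized count of pairs (x, d) in Z_N^2 with x, x+d, x+2d all in A."""
    members = set(A)
    count = sum(
        1
        for x in range(N)
        for d in range(N)
        if x in members and (x + d) % N in members and (x + 2 * d) % N in members
    )
    return Fraction(count, N * N)


def identity_holds_for_all_sets(N):
    """Check S(A) + S(complement of A) == 1 - 3a + 3a^2 for every A in Z_N."""
    for size in range(N + 1):
        alpha = Fraction(size, N)
        target = 1 - 3 * alpha + 3 * alpha ** 2
        for A in combinations(range(N), size):
            complement = [x for x in range(N) if x not in A]
            if s3ap(A, N) + s3ap(complement, N) != target:
                return False
    return True


if __name__ == "__main__":
    for N in (4, 5, 7):
        print(N, identity_holds_for_all_sets(N))
```

The identity holds for the odd moduli and fails for $N = 4$ (already for a singleton), which reflects the role of the oddness assumption: the pair $(x, x + 2d)$ is uniformly distributed on $\mathbb{Z}_N^2$ only when 2 is invertible mod $N$.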

These quantities are very natural from a combinatorial perspective, as they capture how
structured a set of a given density can be. It would be interesting to know more about
these limits in general.

It would also be interesting to identify 'limit objects' on which to study quantities
such as $m_\Phi(\alpha)$ and $d_\Phi$ directly, for a general system $\Phi$ of finite complexity. For systems
$\Phi$ corresponding to a single linear equation, the circle group is already known to be a
suitable limit object, in that $m_\Phi(\alpha) = m_\Phi(\alpha, \mathbb{T})$ and $d_\Phi = d_\Phi(\mathbb{T})$, where these quantities
are defined naturally in terms of measurable subsets of $\mathbb{T}$ (see [21, 22]). For more general
systems, one possibility would be to have, for each value of $s \in \mathbb{N}$, a single space $X_s$
on which $d_\Phi$ can be studied directly for any system $\Phi$ of complexity at most $s$ (in
particular we would have $X_1 = \mathbb{T}$). One may expect to characterize such a space $X_s$ in
terms of nilmanifolds of degree at most $s$.

Finally, let us note that one may obtain periodic analogues of the equidistribution
results of [10], namely [10, Theorems 1.16 & 1.19], by similar considerations to those in
this paper; we omit the details.



Appendix A. Mal'cev bases 



This appendix gathers some technical tools on Mal'cev bases and related notions. 

Definition A.1 (Mal'cev basis). Let $(G/\Gamma, G_\bullet)$ be an $m$-dimensional filtered nilmanifold.
A basis $\mathcal{X} = \{X_1, \ldots, X_m\}$ for the Lie algebra $\mathfrak{g}$ over $\mathbb{R}$ is called a Mal'cev basis
for $G/\Gamma$ adapted to $G_\bullet$ if the following conditions are satisfied.

(i) For each $j \in [0, m - 1]$ the subspace $\mathfrak{h}_j := \mathrm{Span}(X_{j+1}, \ldots, X_m)$ is an ideal in $\mathfrak{g}$,
and hence $H_j := \exp \mathfrak{h}_j$ is a normal Lie subgroup of $G$.

(ii) For every $i \in [0, s]$ we have $G_i = H_{m - m_i}$.

(iii) Each $g \in G$ can be written uniquely as $\exp(t_1 X_1) \exp(t_2 X_2) \cdots \exp(t_m X_m)$, for
$t_i \in \mathbb{R}$.

(iv) $\Gamma = \{\exp(t_1 X_1) \exp(t_2 X_2) \cdots \exp(t_m X_m) : t_i \in \mathbb{Z}\}$.

The Mal'cev coordinate map $\psi : G \to \mathbb{R}^m$ referred to in Section 5 is then just the
map sending $g \in G$ to its corresponding tuple $(t_1, \ldots, t_m) \in \mathbb{R}^m$ from (iii) above.

Given a basis $\mathcal{X}$ on $G/\Gamma$, the following result describes a natural Mal'cev basis for
a power $G^t/\Gamma^t$. This will enable us to relate Lipschitz norms on these two manifolds as
needed in the proof of Theorem 7.1.

Lemma A.2. Let $(G/\Gamma, G_\bullet)$ be an $m$-dimensional filtered nilmanifold of degree at most
$s$ with a $Q$-rational Mal'cev basis $\mathcal{X} = \{X_1, \ldots, X_m\}$, and let $t \in \mathbb{N}$. Let $\mathfrak{g}^t$ denote the
direct sum of $t$ copies of $\mathfrak{g}$, let $\mathcal{X}^t \subset \mathfrak{g}^t$ be the set of vectors $X_{i,j} = (0, \ldots, 0, X_j, 0, \ldots, 0)$
where $X_j$ appears at the $i$th entry, and let $\mathcal{X}^t$ be ordered according to the colex order on
$[t] \times [m]$. Then $\mathcal{X}^t$ is a $Q$-rational Mal'cev basis for $(G^t/\Gamma^t, G^t_\bullet)$.

10 This notion of complexity was defined in [B].

The proof is routine, being based on the fact that the exponential map from jj* to 
G* is given by (i^, ...,v m ) ^ (exp(vi), . . . , exp(t> m )) [27]. 

We now use the form of this basis to relate the metrics on G/Y and G t /Y t . Recall 
that, given a Mal'cev basis X for a nilmanifold (G/r,G,), the metric do = dp^c was 
defined as the largest metric d for which d(x,y) < ||'0( a; Z/~ 1 )ll O o f° r an x,y e G 



Lemma A. 3. Let X be a Mal'cev basis for (G/T,G.) and let X 1 be the corresponding 
basis for (G t /Y t , G*) given by Lemma lA.Sl Then, for any x, y G G l , 

d G t{x,y) > max<i G (xi,yi) and d G t /^(xY^yY 1 ) > maxd G / r (xiY, yT 1 ). 
»e[t] ie[t] 

In fact it is not hard to show (using [10, Lemma A.4]) that the metrics $d_{G^t/\Gamma^t}(x\Gamma^t, y\Gamma^t)$
and $\max_{i \in [t]} d_{G/\Gamma}(x_i\Gamma, y_i\Gamma)$ on $G^t/\Gamma^t$ are Lipschitz equivalent with constant depending
on the rationality bound for $\mathcal{X}$, but we do not need this fact here.

Proof. Write $\psi : G \to \mathbb{R}^m$ for the Mal'cev map on $G$ corresponding to $\mathcal{X}$, and $\psi_t : G^t \to
\mathbb{R}^{m \times t}$ for the one on $G^t$ corresponding to $\mathcal{X}^t$. Regarding $\psi_t(x)$ as a matrix, it is easy to see
that

$$\psi_t(x_1, \ldots, x_t) = \big(\psi(x_1)^{\mathsf{T}} \;\cdots\; \psi(x_t)^{\mathsf{T}}\big).$$

From this it is immediate that the metric $d'(x,y) := \max_{i \in [t]} d_G(x_i, y_i)$ on $G^t$ satisfies
$d'(x,y) \le \|\psi_t(xy^{-1})\|_\infty$ for all $x, y \in G^t$. Since $d_{G^t}$ is the largest metric satisfying this
condition, we have $d'(x,y) \le d_{G^t}(x,y)$ for all $x, y \in G^t$, which was the first claim.

The claim for $G^t/\Gamma^t$ then follows immediately since, for any $i \in [t]$,

$$d_{G^t/\Gamma^t}(x\Gamma^t, y\Gamma^t) = \inf_{\gamma \in \Gamma^t} d_{G^t}(x, y\gamma) \ge \inf_{\gamma \in \Gamma^t} d_G(x_i, y_i\gamma_i) = d_{G/\Gamma}(x_i\Gamma, y_i\Gamma).$$ □

Finally we relate this to the corresponding Lipschitz norms. Recall that

$$\|F\|_{\mathrm{Lip}(\mathcal{X})} = \|F\|_\infty + \sup_{x\Gamma \ne y\Gamma} \frac{|F(x\Gamma) - F(y\Gamma)|}{d_{G/\Gamma}(x\Gamma, y\Gamma)}.$$



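Lemma A.4. Let $(G/\Gamma, G_\bullet, \mathcal{X})$ be a filtered nilmanifold, let $t$ be a positive integer, and
let $F : G/\Gamma \to \mathbb{C}$ be a function. Then, writing $F^{\otimes t}$ for the function $(x_1, \ldots, x_t) \mapsto
F(x_1) \cdots F(x_t)$ on the power $(G^t/\Gamma^t, G^t_\bullet, \mathcal{X}^t)$, we have $\|F^{\otimes t}\|_{\mathrm{Lip}(\mathcal{X}^t)} \le t\,\|F\|^t_{\mathrm{Lip}(\mathcal{X})}$.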



That is, $d(x,y) := \sup_{i \in I} d_i(x, y)$ where the $d_i$ are all the metrics satisfying the condition, whence
$d_i(x,y) \le d(x,y)$ for all such metrics.

Proof. By telescoping it is easy to see that

$$|F(x_1) \cdots F(x_t) - F(y_1) \cdots F(y_t)| \le \|F\|_\infty^{t-1} \sum_{i \in [t]} |F(x_i) - F(y_i)|.$$

Dividing by $d_{G^t/\Gamma^t}(x, y)$ and applying Lemma A.3, the claim follows. □
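The telescoping inequality used in this proof can be checked numerically. The following Python sketch is an illustration of ours and not part of the original argument: the helper name `telescoping_gap` is hypothetical, and the maximum modulus of the supplied values stands in for $\|F\|_\infty$.

```python
def telescoping_gap(xs, ys):
    """Return (lhs, rhs) for |prod xs - prod ys| <= B^(t-1) * sum |x_i - y_i|.

    xs and ys are equal-length lists of complex numbers, thought of as the
    values F(x_i) and F(y_i); B is the largest modulus among them.
    """
    t = len(xs)
    bound = max(abs(z) for z in xs + ys)  # assumption: plays the role of ||F||_infinity
    prod_x = prod_y = 1
    for x, y in zip(xs, ys):
        prod_x *= x
        prod_y *= y
    lhs = abs(prod_x - prod_y)
    rhs = bound ** (t - 1) * sum(abs(x - y) for x, y in zip(xs, ys))
    return lhs, rhs
```

Calling `telescoping_gap` on any two lists of the same length should return a pair with the first entry no larger than the second, up to floating-point rounding.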



Appendix B. Factorizing non-irrational polynomials 

In this appendix we show how to deduce the full factorization result, Lemma 5.6,
from the analogous result for Taylor coefficients, Lemma 5.3. This is completely similar
to the deduction of [8, Lemma 2.10] from [8, Lemma A.7], but for completeness we
include the proof. The first step is to establish the following basic factorization, analogous
to [8, Lemma 2.9].

Lemma B.1 (Basic factorization). Let $(G/\Gamma, G_\bullet, \mathcal{X})$ be a nilmanifold of complexity at
most $M$, and let $g \in \mathrm{poly}(\mathbb{Z}, G_\bullet)$ be such that $g(0) = \mathrm{id}_G$ and $g(q\mathbb{Z}) \subset \Gamma$ for some
integer $q$ with $p_1(q) > A$. Then at least one of the following statements holds.

(i) $g$ is $A$-irrational in $(G/\Gamma, G_\bullet, \mathcal{X})$.

(ii) There exists a factorization $g = g'\gamma$, with $g' \in \mathrm{poly}(\mathbb{Z}, G'_\bullet)$ such that $g'(n)\Gamma'$
takes values in a subnilmanifold $(G'/\Gamma', G'_\bullet, \mathcal{X}')$ of $G/\Gamma$ of strictly smaller total
dimension and of complexity $O_{M,A}(1)$, $g'(0) = \mathrm{id}_G$, and $\gamma \in \mathrm{poly}(\mathbb{Z}, G_\bullet)$ is
$\Gamma$-valued.

Note that $g'$ also satisfies $g'(q\mathbb{Z}) \subset \Gamma'$; indeed $G' \ni g'(qn) = g(qn)\gamma(qn)^{-1} \in \Gamma$.
Similarly, if $(g(n)\Gamma)$ is $q$-periodic, then so is $(g'(n)\Gamma')$.

Proof. Assume $g$ is not $A$-irrational, with Taylor expansion

$$g(n) = g_1^{\binom{n}{1}} g_2^{\binom{n}{2}} \cdots g_s^{\binom{n}{s}},$$

where $g_i \in G_i$ for each $i$. Lemma 5.4 then implies that, for some $i \in [s]$, we have
$g_i = g'_i\gamma_i$ where $g'_i \in \ker \xi_i$ for some non-trivial $i$th-level character $\xi_i : G_i \to \mathbb{R}$ of
complexity at most $A$, and $\gamma_i \in \Gamma_i$. As in [8], we shall now consider the cases $i > 1$ and
$i = 1$ separately.

For $i > 1$ we write $g(n) = g_{<i}(n)\,(g'_i\gamma_i)^{\binom{n}{i}}\,g_{>i}(n)$, where $g_{<i}(n) = g_1^{\binom{n}{1}} \cdots g_{i-1}^{\binom{n}{i-1}}$
and $g_{>i}(n) = g_{i+1}^{\binom{n}{i+1}} \cdots g_s^{\binom{n}{s}}$. Now by (C.1) we have

$$(g'_i\gamma_i)^{\binom{n}{i}} = (g'_i)^{\binom{n}{i}}\,\gamma_i^{\binom{n}{i}} \prod_\alpha g_\alpha^{Q_\alpha(\binom{n}{i})},$$

where each $g_\alpha$ is an iterated commutator of $k_1 = k_{1,\alpha}$ copies of $g'_i$ and $k_2 = k_{2,\alpha}$ copies of
$\gamma_i$, where $k_1, k_2 \ge 1$ and $k_1 + k_2 \ge 2$, and where the $Q_\alpha$ are polynomials of degree $\le k_1 + k_2$
with no constant term. It follows from this and the group property that $\prod_\alpha g_\alpha^{Q_\alpha(\binom{n}{i})}$ is
a $G_{2i}$-valued polynomial sequence. Therefore the sequence $\tilde g_{>i}(n) := \prod_\alpha g_\alpha^{Q_\alpha(\binom{n}{i})}\,g_{>i}(n)$
is a $G_{i+1}$-valued polynomial sequence. We can then write

$$g(n) = g'(n)\,\gamma_i^{\binom{n}{i}}, \quad\text{where}\quad g'(n) = g_{<i}(n)\,(g'_i)^{\binom{n}{i}}\,\hat g_{>i}(n),$$

and $\hat g_{>i}(n) = [\gamma_i^{\binom{n}{i}}, \tilde g_{>i}(n)]\,\tilde g_{>i}(n)$ is a $G_{i+1}$-valued polynomial sequence. As in [8], then,
we have that $g'$ is in $\mathrm{poly}(\mathbb{Z}, G'_\bullet)$, where $G'/\Gamma' = G/\Gamma$ and $G'_j = G_j$ for $j \ne i$, with
$G'_i = \ker(\xi_i)$. Indeed, $G'_\bullet$ is a filtration since $G_{i+1}, [G_j, G_{i-j}] \subset \ker(\xi_i)$ by definition, and
$g'$ is a polynomial adapted to $G'_\bullet$ since $g_{<i}(n)\,(g'_i)^{\binom{n}{i}}$ is (by Lemma 2.8) and $\hat g_{>i}$ is as well.
This completes the case $i > 1$, since $\dim(\ker(\xi_i)) < \dim(G_i)$.

For $i = 1$ we can just set $G'_0 = G'_1 = \ker(\xi_1)$ as the first two terms of our new
filtration, since $g_0 = \mathrm{id}_G$, and our factorization is then $g(n) = g'(n)\,\gamma_1^n$, where $g'(n) =
(g'_1)^n\,\hat g_{>1}(n)$, with $\hat g_{>1}$ taking values in $G'_2$.

In both cases above, to obtain an appropriate Mal'cev basis for the new filtration,
note that we can simply apply [10, Proposition A.10], $\ker \xi_i$ being boundedly rational
since $\xi_i$ has bounded complexity. □

Recall that the complexity $M_0$ of a filtered nilmanifold is a common upper bound
for the dimension, the degree, and the rationality of the Mal'cev basis. In particular
the total dimension $\sum_i \dim(G_i)$ is bounded by $(s + 1)M_0 \le (M_0 + 1)M_0$. Recall also
from Section 2.2 the definition of the "fractional part" $\{g\}$ of $g \in G$ relative to $\Gamma$.

Proof of Lemma 5.6. If the given sequence $g$ is $\mathcal{F}(M_0)$-irrational in $(G/\Gamma, G_\bullet)$ then we
are done; if not, then by Lemma B.1 we have $g = g_1\gamma_1$ where $\gamma_1$ is $\Gamma$-valued, $g_1$ is
$G'$-valued, where $(G'/\Gamma', G'_\bullet)$ is a subnilmanifold of $(G/\Gamma, G_\bullet)$ of complexity $M_1 =
O_{\mathcal{F}(M_0)}(1)$ and strictly smaller total dimension than $(G/\Gamma, G_\bullet)$, and moreover $g_1(0) =
\mathrm{id}_G$ and $g_1(q\mathbb{Z}) \subset \Gamma'$. Now, if $g_1$ is $\mathcal{F}(M_1)$-irrational in $(G'/\Gamma', G'_\bullet)$, then we are done;
otherwise we apply Lemma B.1 again to $g_1$. Carrying on this way, the process must
stop after at most $O_{M_0}(1)$ applications of Lemma B.1, by the initial bound on the total
dimension of $(G/\Gamma, G_\bullet)$, and the full factorization follows. (Note that to be able to apply
the lemma enough times, we need $p_1(q)$ greater than the final irrationality requirement
we may end up with, which is $\mathcal{F}(M_j) = O_{\mathcal{F}, M_0}(1)$ for some $M_j$ as constructed above.)

For the final claim in the lemma, note that if $g$ is $q$-periodic mod $\Gamma$ then the first part
of the lemma applied to the sequence $\{g(0)\}^{-1}\, g\, [g(0)]^{-1}$ yields the claimed factorization
$g = \varepsilon g' \gamma$. □

Appendix C. Polynomials with respect to prefiltrations

In this section we record some facts about polynomials with respect to prefiltrations.
By a prefiltration of degree at most $s$ in a group $G$ we here mean a sequence $(G_i)_{i \ge 0}$ of
subgroups of $G$ with $G \supseteq G_0 \supseteq G_1 \supseteq \cdots \supseteq G_s \supseteq G_{s+1} = \{\mathrm{id}_G\}$ and $[G_i, G_j] \subset G_{i+j}$
for all $i, j \ge 0$. The definition of $\mathrm{poly}(\mathbb{Z}, G_\bullet)$ extends to prefiltrations with no change,
consisting of all the maps $g : \mathbb{Z} \to G$ such that $\partial_{h_1} \cdots \partial_{h_i} g(n) \in G_i$ for all $i \ge 0$ and
$h_1, \ldots, h_i, n \in \mathbb{Z}$. Moreover, as with filtrations, this space forms a group, as follows
immediately from [10, Proposition 6.5]. We also have the following version of the Taylor
expansion lemma for filtrations.



Lemma C.1 (Taylor expansion for prefiltrations). Let $g \in \mathrm{poly}(\mathbb{Z}, G_\bullet)$, where $G_\bullet$ is a
prefiltration of degree at most $s$. Then there are unique coefficients $g_i \in G_i$ such that

$$g(n) = g_0\, g_1^{\binom{n}{1}} g_2^{\binom{n}{2}} \cdots g_s^{\binom{n}{s}}$$

for all $n \in \mathbb{Z}$.

There are several ways to prove this; we follow a natural induction along the lines
of Leibman [TTl §4.7] that makes use of the following lemma.

Lemma C.2. Let $g, h \in \mathrm{poly}(\mathbb{Z}, G_\bullet)$ where $G_\bullet$ is a prefiltration of degree at most $s$. If
$g(n)$ and $h(n)$ agree for $n = 0, 1, \ldots, s$ then they agree for all $n$.

Proof. This follows immediately by induction by considering the polynomials $\partial_1 g, \partial_1 h$. □

Proof of Lemma C.1. Let $g_0 := g(0) \in G_0$. Suppose $g_0, \ldots, g_i$ have been found so that
$g_j \in G_j$, $g(n) = g_0\, g_1^{\binom{n}{1}} \cdots g_i^{\binom{n}{i}}$ for $n = 0, \ldots, i$, and $g(n)G_{i+1} = g_0\, g_1^{\binom{n}{1}} \cdots g_i^{\binom{n}{i}} G_{i+1}$ for all $n \in \mathbb{Z}$;
note that this holds for $i = 0$ since $g \in \mathrm{poly}(\mathbb{Z}, G_\bullet)$. We then define

$$g_{i+1} := \Big(g_0\, g_1^{\binom{i+1}{1}} \cdots g_i^{\binom{i+1}{i}}\Big)^{-1} g(i+1),$$

so that $g_{i+1} \in G_{i+1}$. Then certainly $g(n) = g_0\, g_1^{\binom{n}{1}} \cdots g_{i+1}^{\binom{n}{i+1}}$ for $n = 0, 1, \ldots, i + 1$. But
the polynomials $g(n)G_{i+2}$ and $g_0\, g_1^{\binom{n}{1}} \cdots g_i^{\binom{n}{i}} g_{i+1}^{\binom{n}{i+1}} G_{i+2}$ lie in prefiltrations of degree at most
$i + 1$ and are therefore equal for all $n$ by the above lemma, allowing us to move on to
the next stage of the construction. We are done once we have $g_0, \ldots, g_s$, since $G_{s+1}$ is
trivial. □
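The stage-by-stage construction in this proof can be illustrated in the simplest abelian case. The Python sketch below is ours and makes a simplifying assumption not in the text: $G = \mathbb{Z}$ is written additively with $G_0 = \cdots = G_s = \mathbb{Z}$ and $G_{s+1} = \{0\}$, so that products become sums and the defining formula reads $g_{i+1} = g(i+1) - \sum_{j \le i} \binom{i+1}{j} g_j$. The function names are hypothetical.

```python
def binom(n, i):
    """Generalized binomial coefficient C(n, i) for any integer n and i >= 0."""
    value = 1
    for j in range(i):
        # Exact division: before this step value == C(n, j), afterwards C(n, j + 1).
        value = value * (n - j) // (j + 1)
    return value


def taylor_coefficients(g, s):
    """Coefficients g_0, ..., g_s with g(n) = sum_i C(n, i) * g_i for n = 0, ..., s.

    Assumes g is an integer-valued polynomial of degree at most s (the additive
    analogue of a polynomial map adapted to the filtration described above).
    """
    coeffs = [g(0)]
    for i in range(s):
        # Value at n = i + 1 already accounted for by g_0, ..., g_i.
        partial = sum(binom(i + 1, j) * c for j, c in enumerate(coeffs))
        coeffs.append(g(i + 1) - partial)
    return coeffs


def evaluate(coeffs, n):
    """Evaluate the binomial expansion sum_i C(n, i) * coeffs[i] at the integer n."""
    return sum(binom(n, i) * c for i, c in enumerate(coeffs))
```

For example, for $g(n) = n^3 - 2n + 5$ and $s = 3$ the coefficients are `[5, -1, 6, 6]`, and `evaluate` then agrees with $g$ at every integer, in line with Lemma C.2.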
Appendix D. On the pairwise-independence condition

In this section we examine to what extent the pairwise-independence condition on 
the linear forms is needed for our main convergence results. First we note that these 
results do hold for systems of two linearly dependent forms. 

Lemma D.1. Let $\Phi$ consist of two integer linear forms $\varphi_1, \varphi_2$ with $\varphi_2 = k\varphi_1$, for some
integer $k \notin \{0, 1\}$. Then as $p \to \infty$ through the primes we have $d_\Phi(\mathbb{Z}_p) \to 1/2$ and
$m_\Phi(\alpha, p) \to m_\Phi(\alpha)$, where $m_\Phi(\alpha)$ equals $0$ for $\alpha \le 1/2$ and $2\alpha - 1$ for $\alpha > 1/2$.

Proof. Let us start with $d_\Phi$. For $p$ large, it is easy to see that $A$ is $\Phi$-free if and only
if $A \cap (k \cdot A) = \emptyset$, and we can construct such a set of density asymptotically $1/2$
relatively easily. Indeed, if $k = -1$ simply let $A = [(p - 1)/2]$. Otherwise, let $n$ be
the order of $k$ in the multiplicative group $\mathbb{Z}_p^*$. Let $H$ be the multiplicative subgroup
$\{k^j : j \in [n]\}$, with $\mathbb{Z}_p^* = \bigsqcup_{j \in [m]} y_j \cdot H$, where $m = (p - 1)/n = O_k(p/\log p)$. Let
$E = \{k^{2j} : j \in [\lfloor n/2 \rfloor]\} \subset H$, and define $A = \bigsqcup_{i \in [m]} y_i \cdot E$. We have $A \cap (k \cdot A) = \emptyset$ and
$|A|/p = 1/2 + O_k(1/\log p)$. On the other hand, clearly $A \cap (k \cdot A) \ne \emptyset$ for any set $A$ of
size at least $(p + 1)/2$. Hence $d_\Phi(\mathbb{Z}_p) \to 1/2$.

Regarding $m_\Phi(\alpha, p)$, note first that for $\alpha < 1/2$ any subset of density $\alpha$ of the set
$A$ constructed above shows that $m_\Phi(\alpha, p) = 0$. For $\alpha > 1/2$, note the relationship
$S_\Phi(A') = 1 - 2\alpha + S_\Phi(A)$ between a set $A \subset \mathbb{Z}_p$ and its complement $A'$, as follows from
the bilinearity of $S_\Phi$. Since the $S_\Phi(A')$ term is always non-negative, letting $A' \subset \mathbb{Z}_p$ be
a $\Phi$-free set of density $1 - \alpha$ then gives $A$ such that $S_\Phi(A) = 2\alpha - 1 = m_\Phi(\alpha, p)$. □
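The coset construction of the set $A$ in this proof is explicit enough to be carried out by machine. The following Python sketch is an illustration of ours (the function names are hypothetical, and $p$ is assumed to be an odd prime not dividing $k$): it builds $E$ inside the subgroup $H$ generated by $k$ and takes one translate $y \cdot E$ per coset of $H$.

```python
def multiplicative_order(k, p):
    """Order of k in the multiplicative group of integers modulo the prime p."""
    k %= p
    if k == 0:
        raise ValueError("k must be invertible mod p")
    n, power = 1, k
    while power != 1:
        power = power * k % p
        n += 1
    return n


def dilation_free_set(k, p):
    """Return a set A inside Z_p with A and k*A disjoint, via the coset construction."""
    k %= p
    if k == p - 1:  # the case k = -1: take the interval [(p - 1)/2]
        return set(range(1, (p - 1) // 2 + 1))
    n = multiplicative_order(k, p)
    # E = {k^(2j) : 1 <= j <= floor(n/2)}, a subset of H = <k>.
    E = {pow(k, 2 * j, p) for j in range(1, n // 2 + 1)}
    A, covered = set(), set()
    for y in range(1, p):
        if y in covered:
            continue  # y lies in a coset of H that has already been handled
        covered.update(y * pow(k, j, p) % p for j in range(n))
        A.update(y * e % p for e in E)
    return A
```

The resulting set has exactly $m \lfloor n/2 \rfloor$ elements with $m = (p-1)/n$, which is the count behind the density estimate $1/2 + O_k(1/\log p)$.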

This proof provides a completely explicit extremal set A. As such, it is not obvious 
how to extend this result to systems of more than two forms, two of which are linearly 
dependent, or indeed to finite families of systems, one of which consists just of two 
dependent forms. To prove convergence in this setup, it seems that one would instead 
want a transference result for systems of two dependent forms, that would be compatible 
with the transference results we already have for systems of finite complexity. We shall 
now show that such a hypothetical transference result cannot be based on the uniformity 
norms, at least not in the usual way. Indeed we shall construct, given a system $\Phi$ of two
dependent forms and any $d \ge 1$, a family of sets in $\mathbb{Z}_p$ for which the solution measure $S_\Phi$
is not controlled by the $U^d$ norm. The sets we shall consider are essentially so-called
$\mathrm{Nil}_d$ Bohr sets in $\mathbb{Z}_p$ (see [12]) and were already used to similar effect in [8].



Such a result is somewhat folklore, but this seems a suitable place to record it.

Proposition D.2. Let $\Phi$ consist of two integer linear forms $\varphi_1, \varphi_2$ with $\varphi_2 = k\varphi_1$ for
an integer $k$ with $|k| \ge 2$. Let $d \ge 1$ and set $\delta = 1/4k^{2d}$. Then for any prime $p$ there is
a set $A \subset \mathbb{Z}_p$ such that

(i) $S_\Phi(A) = 0$, (ii) $\alpha := |A|/p = 2\delta + o_{p \to \infty; k, d}(1)$, and (iii) $\|1_A - \alpha\|_{U^d} = o_{p \to \infty; k, d}(1)$.



Proof. We shall assume that $p$ is large, and for notational convenience we restrict to
positive $k$. Let $I$ denote the interval $[\lfloor p/k^d \rfloor - \delta p, \lfloor p/k^d \rfloor + \delta p]$ in $\mathbb{Z}_p$, and set

$$A = \{x \in \mathbb{Z}_p : x^d \in I \bmod p\}.$$

Note that $S_\Phi(A) = |A \cap (k \cdot A)|/p$, and that if $y = kx \in k \cdot A$ then $y^d \in k^d \cdot I \subset
[p - p/2k^d, p + p/2k^d]$. But the latter interval is disjoint from $I$ since $p/2k^d < \lfloor p/k^d \rfloor - \delta p$,
hence the first property of the conclusion holds.

To establish the other two properties we shall use the Fourier transform, defined as
$\hat f(r) = \mathbb{E}_{x \in \mathbb{Z}_p} f(x) e(-r \cdot x)$. First note that by Fourier inversion we have

$$1_A(x) = 1_I(x^d) = \sum_t \hat{1}_I(t)\, e(t \cdot x^d). \qquad (6)$$



We shall use this expression together with the two standard estimates $\sum_t |\hat{1}_I(t)| =
O(\log p)$ and $\|e(t \cdot x^d)\|_{U^d} = O(p^{-1/2^d})$ for $t$ non-zero, the latter being essentially a
Weyl differencing estimate; see e.g. [20, Ex 11.1.12]. For (ii), then, we have

$$\alpha = \mathbb{E}_x 1_A(x) = |I|/p + \sum_{t \ne 0} \hat{1}_I(t)\, \mathbb{E}_x e(t \cdot x^d),$$

and the latter expression is at most $\sum_{t \ne 0} |\hat{1}_I(t)| \cdot |\mathbb{E}_x e(t \cdot x^d)| = O(p^{-1/2^d} \log p)$. For
(iii), coupling (6) with the $U^d$ triangle inequality yields

$$\|1_A - \alpha\|_{U^d} \le \sum_{t \ne 0} |\hat{1}_I(t)|\, \big\|e(t \cdot x^d) - \mathbb{E}_y e(t \cdot y^d)\big\|_{U^d} = O(p^{-1/2^d} \log p),$$

and we are done. □



The set $A$ given by this proposition is thus virtually indistinguishable from the
constant function $\alpha$ from the point of view of the $U^d$ uniformity norm, whereas $S_\Phi(A)$
and $S_\Phi(\alpha)$ are not at all close.
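Property (i) of the proposition can be verified directly for specific parameters. The Python sketch below is ours, with hypothetical function names; it builds $A = \{x : x^d \in I \bmod p\}$ using integer arithmetic only, rewriting the condition $|x^d - \lfloor p/k^d \rfloor| \le \delta p$ as $4k^{2d}\,|x^d - \lfloor p/k^d \rfloor| \le p$, and counts the solutions $(x, kx) \in A \times A$. It assumes $k \ge 2$ and $p$ large compared with $k^{2d}$, as in the proof.

```python
def power_interval_set(k, d, p):
    """Return A = {x in Z_p : x^d mod p lies in I}, with I centred at floor(p / k^d).

    The radius of I is delta * p with delta = 1 / (4 k^(2d)); membership is tested
    with integers to avoid floating-point rounding.
    """
    kd = k ** d
    centre = p // kd
    return {x for x in range(p) if 4 * kd * kd * abs(pow(x, d, p) - centre) <= p}


def solution_count(A, k, p):
    """Number of x in A with k*x (mod p) also in A, that is p * S_Phi(A)."""
    return sum(1 for x in A if k * x % p in A)
```

With $p = 10007$, $k = 2$ and $d = 2$, for instance, the set contains $x = 50$ (since $50^2 = 2500$ lies within $\delta p$ of $\lfloor p/4 \rfloor = 2501$), while `solution_count` returns 0. Properties (ii) and (iii) are asymptotic statements and are not tested by this sketch.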



13 Here as usual $r \cdot x = rx/p$, and $e(\theta) = \exp(2\pi i \theta)$ for any $\theta \in \mathbb{T}$.

Appendix E. The arithmetic removal lemma

In this last appendix we establish Theorem 8.4, which we restate here.

Theorem E.1. Let $\Phi$ be a system of linear forms $\varphi_1, \ldots, \varphi_t : \mathbb{Z}^D \to \mathbb{Z}$. Then there
is a positive integer $K$ such that the following holds. For any $\varepsilon > 0$, there exists
$\delta = \delta(\varepsilon, \Phi) > 0$ such that if $N \in \mathbb{N}$ is prime to $K$, and $A_1, \ldots, A_t$ are subsets of $\mathbb{Z}_N$
such that $S_\Phi(A_1, \ldots, A_t) \le \delta$, then there exist sets $E_i \subset \mathbb{Z}_N$ with $|E_i| \le \varepsilon N$ for all
$i \in [t]$, such that $S_\Phi(A_1 \setminus E_1, \ldots, A_t \setminus E_t) = 0$.

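Proof. This will follow from the removal lemma [Tl| Theorem 1] of Král', Serra and Vena
provided we can find a homomorphism (or matrix) $\Lambda : \mathbb{Z}^t \to \mathbb{Z}^k$ such that $\ker_{\mathbb{Z}_N} \Lambda =
\Phi(\mathbb{Z}_N^D)$, since then $S_\Phi(A_1, \ldots, A_t) = |A_1 \times \cdots \times A_t \cap \ker_{\mathbb{Z}_N} \Lambda| / |\ker_{\mathbb{Z}_N} \Lambda|$. Here

$$\ker_{\mathbb{Z}_N} \Lambda = \{y + N\mathbb{Z}^t \in \mathbb{Z}_N^t : \Lambda(y) \in N\mathbb{Z}^k\},$$
$$\Phi(\mathbb{Z}_N^D) = \{\Phi(x) + N\mathbb{Z}^t : x \in \mathbb{Z}^D\}.$$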

We construct such a $\Lambda$ in stages. First let $f : \mathbb{Z}^t \to \mathbb{Z}^t/\Phi(\mathbb{Z}^D)$ be the quotient map
$x \mapsto x + \Phi(\mathbb{Z}^D)$. The target of this map, being a finitely generated abelian group, is isomorphic to
$Z := \mathbb{Z}^k \times \mathbb{Z}_{N_1} \times \cdots \times \mathbb{Z}_{N_r}$ for some integers $k \ge 0$ and $N_j \in \mathbb{N}$; let $g$ be a corresponding
isomorphism. Assume $N$ is prime to each $N_j$; then $N \cdot Z = (N\mathbb{Z}^k) \times \mathbb{Z}_{N_1} \times \cdots \times \mathbb{Z}_{N_r}$. We
claim that $\Lambda := \pi \circ g \circ f$ satisfies the required relationship, where $\pi$ denotes projection
to $\mathbb{Z}^k$. Indeed, writing $A \oplus H = \{a + H : a \in A\} \subset G/H$ for a set $A \subset G$ and a
subgroup $H \le G$, we have

$$\Lambda^{-1}(N\mathbb{Z}^k) = f^{-1}(g^{-1}(N Z)) = f^{-1}(N\mathbb{Z}^t \oplus \Phi(\mathbb{Z}^D)) = N\mathbb{Z}^t + \Phi(\mathbb{Z}^D),$$

the second equality following from $g$ being an isomorphism. Reducing mod $N\mathbb{Z}^t$ gives
the required relationship. □